The present invention is situated in the
field of design of systems. More specifically, the present
invention is related to a design apparatus for digital
systems, generating implementable descriptions of said
systems.

The present invention is also related to a 
method for generating implementable descriptions of said 
systems. 

State of the art

The current need for digital systems confronts
contemporary system designers with ever-increasing design
complexity. In most applications where dedicated
processors and other digital hardware are used, demand for
new systems is rising and development time is shortening.
As an example, there is currently a high interest in
digital communication equipment for public access networks.
Examples are modems for Asymmetric Digital Subscriber Loop
(ADSL) applications, and up- and downstream Hybrid
Fiber-Coax (HFC) communication. These modems are preferably
implemented in all-digital hardware using digital signal
processing (DSP) techniques, because of the
complexity of the data processing that they require.
Besides this, these systems also need short development
cycles. This calls for a design methodology that starts at
a high level and that provides for design automation as much
as possible.

One frequently used modeling description
language is VHDL (VHSIC Hardware Description Language),
which has been accepted as an IEEE standard since 1987.
VHDL is a programming environment that produces a
description of a piece of hardware. Standard VHDL can be
extended with features of Object Oriented Programming
Languages, as described in the paper OO-VHDL (Computer,
October 1995, pages 18-26). Another frequently used
modeling description language is VERILOG.

A number of commercially available system 
environments support the design of complex DSP systems. 

MATLAB of Mathworks Inc offers the
possibility of exploration at the algorithmic level. It
uses the data-vector as the basic semantical feature.
However, the developed MATLAB description has no
relationship to a digital hardware implementation, nor does
MATLAB support the synthesis of digital circuits.

SPW of Alta Group offers a toolkit for the
simulation of this kind of system. SPW is typically used
to simulate data-flow semantics. Data-flow semantics define
explicit algorithmic iteration, whereas data-vector
semantics do not. SPW relies on an extensive library and
toolkit to develop systems. Unlike MATLAB, the initial
description is a block-based description. Each block used
in the systems appears in two different formats (a
simulatable and a synthesizable version), which can result
in inconsistency.

COSSAP of Synopsys performs the same kind of 
system exploration as SPW. 

DC and BC are products of Synopsys that 
support system synthesis. These products do not provide 
sufficient algorithm exploration functions. 

Because all of these tools support only part 
of the desired functionality, contemporary digital systems 
are designed typically with a mix of these environments. 
For example, a designer might do algorithmic exploration in 
MATLAB, then do architecture definition with SPW, and 
finally map the architecture definition to an 
implementation in DC. 



Aims of the invention

It is an aim of the present invention to
disclose a design apparatus that allows generating, from a
behavioral description of a digital system, an
implementable description for said system.

It is another aim of the present invention to
disclose a design apparatus that allows designing
digital systems starting from a data-vector or data-flow
description and generating an implementable level of
description such as VHDL. A further aim is to perform such
design tasks within one object oriented environment.

Another aim is to provide a means comprised 
in said design apparatus for simulating the behavior of the 
system at any level of the design stage or trajectory. 



Summary of the invention

A first aspect of the present invention
concerns a design apparatus compiled on a computer
environment for generating from a behavioral description of
a system comprising at least one digital system part, an
implementable description for said system, said behavioral
description being represented on said computer environment
as a first set of objects with a first set of relations
therebetween, said implementable description being
represented on said computer environment as a second set of
objects with a second set of relations therebetween, said
first and second set of objects being part of a design
environment.

A behavioral description is a description
which substantiates the desired behavior of a system in a
formal way. In general, a behavioral description is not
readily implementable since it is a high-level description,
and it only describes an abstract version of the system
that can be simulated. An implementable description is a
more concrete description that is, in contrast to a
behavioral description, detailed enough to be implemented
in software to provide an approximative simulation of
real-life behavior or in hardware to provide a working
semiconductor circuit.

A design environment is understood to be an
environment in which algorithms can be produced and run by
interpretation or compilation.

An object is understood to be a data structure which
shows all the characteristics of an object from an object
oriented programming language, such as described in "Object
Oriented Design" (G. Booch, Benjamin/Cummings Publishing,
Redwood City, Calif., 1991).

Said first and second set of objects are
preferably part of a single design environment.

Said design environment preferably comprises
an Object Oriented Programming Language (OOPL). Said OOPL
can be C++.

Said design environment is preferably an open 
environment wherein new objects can be created. A closed 
environment will not provide the flexibility that can be 
obtained with an open environment and will limit the 
possibilities of the user. 

Preferably, at least part of the input 
signals and output signals of said first set of objects are 
at least part of the input signals and output signals of 
said second set of objects. Essentially all of the input 
signals and output signals of said first set of objects can 
be essentially all of the input signals and output signals 
of said second set of objects. 

At least part of the input signals and output 
signals of said behavioral description are preferably at 
least part of the input signals and output signals of said 
implementable description. Essentially all of the input 
signals and output signals of said behavioral description 
can be essentially all of the input signals and output 
signals of said implementable description. 

Said first set of objects preferably has
first semantics and said second set of objects preferably
has second semantics. Semantics is understood to be the
model of computation. Said first semantics is preferably a
data-vector model and/or a data-flow model. Said second
semantics is preferably a Finite State Machine Data Path
(FSMD) data structure, comprising a control part and a
data processing part, the data processing part being
modeled by a signal flow graph (SFG) data structure and the
control part being modeled by a FSM data structure. The
terms FSMD and SFG are used interchangeably throughout the
text.

Preferably, the impact in said implementable
description of at least a part of the objects of said
second set of objects is essentially the same as the impact
in said behavioral description of at least a part of the
objects of said first set of objects.

Preferably, the impact in said implementable
description of essentially all of the objects of said
second set of objects is essentially the same as the impact
in said behavioral description of essentially all of the
objects of said first set of objects.

Impact is understood to be not only the function,
but also the way the object interacts with its environment
from an external point of view. A way of rephrasing this is
that the same interface for providing input and collecting
output is present. This does not mean that the actual
implementation of the data-processing between input and
output is the same. The implementation is embodied by
objects, which can be completely different but perform the
same function. In an OOPL, the use of methods of an object
without knowing its actual implementation is referred to as
information hiding.
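The notion of identical impact with different implementations can be illustrated by a minimal C++ sketch. The class names `Scaler`, `BehavioralScaler` and `RefinedScaler` and the function `run` are hypothetical and are introduced here only for illustration; they are not part of the design apparatus described in this text.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical common interface: one input sample in, one output sample out.
class Scaler {
public:
    virtual ~Scaler() = default;
    virtual double process(double in) = 0;
};

// Behavioral version: a plain floating point multiplication by 0.5.
class BehavioralScaler : public Scaler {
public:
    double process(double in) override { return 0.5 * in; }
};

// Refined version: the same function, computed on an integer
// with 8 fractional bits (an assumption made for this sketch).
class RefinedScaler : public Scaler {
public:
    double process(double in) override {
        long raw = std::lround(in * 256.0);  // convert to 8 fractional bits
        raw = raw / 2;                       // halve the integer value
        return static_cast<double>(raw) / 256.0;
    }
};

// Client code only sees the interface (information hiding).
double run(Scaler& s, double x) { return s.process(x); }
```

Code that calls `run` does not need to change when a `BehavioralScaler` is replaced by a `RefinedScaler`, which is the property that the refinement from behavioral to implementable objects relies on.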

The design apparatus preferably further
comprises means for simulating the behavior of said system,
said means simulating the behavior of said behavioral
description, said implementable description or any
intermediate description therebetween. Said intermediate
description can be obtained after one or several refining
steps from said behavioral description.

Preferably, at least part of said second set
of objects is derived from objects belonging to said first
set of objects. This can be done by using the inheritance
functionalities provided in an OOPL. Essentially all of
said second set of objects can be derived from objects
belonging to said first set of objects.

Said implementable description can be at
least partly obtained by refining said behavioral
description. Said implementable description can be
essentially obtained by refining said behavioral
description. Preferably, said refining comprises the
refining of objects.

The design apparatus can further comprise
means to derive said first set of objects from a vector
description, preferably a MATLAB description, describing
said system as a set of operations on data vectors, means
for simulating statically or demand-driven scheduled
dataflow on said dataflow description and/or means for
clock-cycle true simulating said digital system using said
dataflow description and/or one or more of said SFG data
structures.

In a preferred embodiment, said implementable
description is an architecture description of said system,
said system advantageously further comprising means for
translating said architecture description into a
synthesizable description of said system, said
synthesizable description being directly implementable in
hardware. Said synthesizable description is preferably a
netlist of hardware building blocks. Said hardware is
preferably a semiconductor chip or an electronic circuit
comprising semiconductor chips.

A synthesizable description is a description
of the architecture of a semiconductor that can be
synthesized without further processing of the description.
An example is a VHDL description.

Said means for translating said architecture
description into a synthesizable description can be
Cathedral-3 or Synopsys DC.

A second aspect of the present invention is a 
method for designing a system comprising at least one 
digital part, comprising a refining step wherein a 
behavioral description of said system is transformed into 
an implementable description of said system, said 
behavioral description being represented as a first set of 
objects with a first set of relations therebetween and said 
implementable description being represented as a second set 
of objects with a second set of relations therebetween. 

Said refining step preferably comprises 
translating behavioral characteristics at least partly into 
structural characteristics. Said refining step can comprise 
translating behavioral characteristics completely into 
structural characteristics . 

Said method can further comprise a simulation 
step in which the behavior of said behavioral description, 
said implementable description and/or any intermediate 
description therebetween is simulated. 

Said refining step can comprise the addition
of new objects, permitting interaction with existing
objects, and adjustments to said existing objects allowing
said interaction.

Preferably, said refining step is performed
in an open environment and comprises expansion of existing
objects. Expansion of existing objects can include the
addition to an object of methods that create new objects.
Said object is said to be expanded with the new objects.
The use of expandable objects allows the use of meta-code
generation: creating expandable objects implies an indirect
creation of the new objects.

Said behavioral description and said
implementable description are preferably represented in a
single design environment, said single design environment
advantageously being an Object Oriented Programming
Language, preferably C++.

Preferably, said first set of objects has 
first semantics and said second set of objects has second 
semantics. Said first semantics is preferably a data-vector 
model and/or a data-flow model. Said second semantics is 
preferably an SFG data structure. 

The refining step comprises preferably a 
first refining step wherein said behavioral description 
being a data-vector model is at least partly transformed 
into a data-flow model. Advantageously, said data-flow 
model is an untimed floating point data-flow model. 

Said refining step preferably further 
comprises a second refining step wherein said data-flow 
model is at least partly transformed into an SFG model. 
Said data-flow model can be completely transformed into an 
SFG model. 

In a preferred embodiment, said first
refining step comprises the steps of determining the input
vector lengths of input, output and intermediate signals,
determining the amount of parallelism of operations that
process input signals under the form of a vector to output
signals, determining objects, connections between
objects and signals between objects of said data-flow
model, and determining the wordlength of said signals
between objects. In the sequel of this application, the
term "actors" is also used to denote objects. Connections
between objects are denoted as "edges" and signals between
objects are denoted as "tokens". Said step of determining
the amount of parallelism can preferably comprise
determining the amount of parallelism for every data vector
and reducing the unspecified communication bandwidth of
said data-vector model to a fixed number of communication
buses in said data-flow model. Said step of determining
actors, edges and tokens of said data-flow model
preferably comprises defining one or a group of data
vectors in said first data-vector model as actors; defining
data precedences crossing actor bounds as edges, said
edges behaving like queues and transporting tokens between
actors; constructing a system schedule and running a
simulation on a computer environment. Said second refining
step preferably comprises transforming said tokens from
floating point to fixed point. Preferably, said SFG model
is a timed fixed point SFG model.

Said second set of objects with said second
set of relations therebetween are preferably at least
partly derived from said first set of objects with said
first set of relations therebetween. Objects belonging to
said second set of objects are preferably new objects,
identical with and/or derived by inheritance from objects
from said first set of objects, or a combination thereof.

Several of said SFG models can be combined
with a finite state machine description resulting in an
implementable description. Said implementable description
can be transformed to synthesizable code, said
synthesizable code preferably being VHDL code.

Another aspect of the present invention is a
method for simulating a system, wherein a description of a
system is transformed into compilable C++ code.

Preferably, said description is an SFG data
structure and said compilable C++ code is used to perform
clock cycle true simulations.

Several SFG data structures can be combined
with a finite state machine description resulting in an
implementable description, said implementable description
being said compilable C++ code suitable for simulating said
system as software.

A clock-cycle true simulation of a system
uses one or more SFG data structures.

Said clock-cycle true simulation can be an
expectation-based simulation, said expectation-based
simulation comprising the steps of: annotating a token age
to every token; annotating a queue age to every queue;
increasing token age according to the token aging rules and
with the travel delay for every queue that has transported
the token; increasing queue age with the iteration time of
the actor steering the queue, and; checking whether token
age is never smaller than queue age throughout the
simulation.
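The bookkeeping behind these steps can be sketched in C++ as follows. This is only an illustration under stated assumptions: the names `Token` and `AgedQueue` are hypothetical, the token aging rule is reduced to adding the travel delay of the transporting queue, and the check is performed when a token is read. The actual token aging rules are not given in this passage.

```cpp
#include <cassert>
#include <deque>

// A token carries a value and an annotated age (in clock cycles).
struct Token {
    double value;
    long   age;
};

// A queue with an annotated age and a fixed travel delay.
class AgedQueue {
public:
    explicit AgedQueue(long travelDelay) : travelDelay_(travelDelay) {}

    // A transported token ages by the travel delay of this queue.
    void put(Token t) {
        t.age += travelDelay_;
        fifo_.push_back(t);
    }

    // Called once per iteration of the actor steering this queue.
    void advance(long iterationTime) { age_ += iterationTime; }

    // Read a token and record whether its age is smaller than the queue age.
    Token get() {
        assert(!fifo_.empty());
        Token t = fifo_.front();
        fifo_.pop_front();
        if (t.age < age_) expectationMet_ = false;
        return t;
    }

    // True as long as no token was younger than the queue when read.
    bool expectationMet() const { return expectationMet_; }

private:
    std::deque<Token> fifo_;
    long travelDelay_;
    long age_ = 0;
    bool expectationMet_ = true;
};
```

In this sketch a simulation would call `expectationMet()` on every queue at the end of a run; a `false` result flags a point where the stated check (token age never smaller than queue age) failed.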

Another aspect of the present invention is a
hardware circuit or a software simulation of a hardware
circuit designed with the design apparatus as recited
above.

Another aspect of the present invention is a
hardware circuit or a software simulation of a hardware
circuit designed with the method as recited above.

Detailed description of the invention

The present invention will be further
explained by means of examples, which do not limit the
scope of the invention as claimed.

Short description of the drawings

In figures 1A, 1B, 1C and 1D, the overall design
methodology according to an embodiment of the invention is
described.
In figure 2, a targeted architecture of a system that is to
be designed according to the invention is described.
In figure 3, the C++ modeling levels of the target
architecture are depicted.
In figure 4, an SDF model of the PN correlator of the
target architecture of figure 2 is shown.
In figure 5, a CSDF model of the PN correlator is
described.
In figure 6, a MATLAB Dataflow model of the PN correlator
is shown.
In figure 7, the SFG modeling concepts are depicted.
In figure 8, the implied description of the max actor is
described.
In figure 9, example implementations for different
expectations are given.
In figure 10, an overview of expectation based simulation
is shown.
In figure 11, the code in OCAPI, or design environment of
the invention, for a correlator processor is given.
In figure 12, the resulting circuit for datapath and
controller is hierarchically drawn.
Figure 13 describes a DECT Base station setup.
Figure 14 shows the front-end processing of the DECT
transceiver.
In figure 15, a part of the central VLIW controller
description for the DECT transceiver ASIC is shown.
In figure 16, the use of overloading to construct the
signal flowgraph data structure is shown.
In figure 17, an example C++ code fragment and its
corresponding data structure is described.
In figure 18, a graphical and C++ textual description of
the same FSM is shown.
In figure 19, the final system architecture of the DECT
transceiver is shown.
In figure 20, a data-flow target architecture is shown.
In figure 21, the simulation of one cycle in a system with
three components is shown.
In figure 22, the implementation and simulation strategy is
depicted.
In figure 23, an end-to-end model of a QAM transmission
system is shown.
In figure 24, the system contents for the QAM transmission
system is described.



The present invention can be described as a
design environment for performing subsequent gradual
refinement of descriptions of digital systems within one
and the same object oriented programming language
environment. The lowest level is semantically equivalent to
a behavioral description at the register transfer (RT)
level.

A preferred embodiment of the invention
comprising the design method according to the invention is
called OCAPI. OCAPI is part of a global design methodology
concept SOC++. OCAPI includes both a design environment in
an object oriented programming language and a design
method. OCAPI differs from current systems that
support architecture definition (SPW, COSSAP) in that
a designer is guided from the MATLAB level to the
register transfer level. This way, combined semantic and
syntactic translations in the design flow are avoided.

• The designer is offered a single coding framework in an
object oriented programming language, such as C++, to
express refinements to the behavior. An open environment
is used, rather than the usual interface-and-module
approach.

• The coding framework is a container of design concepts,
used in traditional design practice. Some example design
concepts currently supported are simulation queues,
finite state machines, signal flowgraphs, hybrid
floating/fixed point data types, operation profiling and
signal range statistics. The concepts take the form of
object oriented programming language objects (referred to
as objects in the remainder of this text), that can be
instantiated and related to each other.

• With this set of objects, a gradual refinement design
route is offered: more abstract design concepts can be
replaced with more detailed ones in a gradual way. Also,
design concepts are combined in an orthogonal way:
quantization effects and clock cycles (operation/operator
mapping) for instance are two architecture features that
can be investigated separately. Next, the different
design hierarchies can be freely intermixed because of
this object-oriented approach. For instance, it is
possible to simulate half of the description at fixed
point level, while the other half is still in floating
point.

• The use of a single object oriented programming language
framework in OCAPI allows fast design iteration, which is
not possible in the hybrid approach that is typical today.

Compared to existing data-flow-based systems
like SPW and COSSAP, the algorithm iterations
can be freely chosen. Compared to existing hardware design
environments like DC or BC, the design can start from a
specification level that is more abstract than the
connection of blocks.

Two concepts, scaleable parallelism and
expectation based simulation, are introduced. The designer
is given an environment to check the feasibility of what
the designer thinks can be done. In the development
process, the designer creates his library of Signal
FlowGraph (SFG) versions of abstract MATLAB operations.

Description of OCAPI, a preferred embodiment of the present
invention

OCAPI is a C++ library intended for the 
design of digital systems. It provides a short path from a 
system design description to implementation in hardware. 
The library is suited for a variety of design tasks, 
including: 

• Fixed Point Simulations 

• System Performance Estimation 

• System Profiling 

• Algorithm-to-Architecture Mapping

• System Design according to a Dataflow Paradigm 

• Verification and Testbench Development 



Development flow 

The flow layout

The design flow according to an embodiment of
the present invention, as shown in figure 1D, starts off
with an untimed, floating point C++ system description 101.
Since data-processing intensive applications such as
all-digital transceivers are targeted, this description uses
data-flow semantics. The system is described as a network
of communicating components.

At first, the design is refined, and in each
component, features expressing hardware implementation are
introduced, including time (clock cycles) and bittrue
rounding effects. The use of C++ allows this to be
expressed in an elegant way. Also, all refinement is done
in a single environment, which greatly speeds up the design
effort.

Next, the timed, bittrue C++ description 103
is translated into an equivalent HDL description by code
generation. For each component, a controller description
105 and a datapath description 107 can be generated. This
split is made because OCAPI relies on separate
synthesis tools for both parts, each one optimized towards
either controller or datapath synthesis tasks. Alternatively,
a single HDL description can be generated for each
component, this description preferably jointly representing
the control processing and data processing of the
component. Through the use of an appropriate object
modeling hierarchy, the generation of datapath and
controller HDL can be done fully automatically.

For datapath synthesis 109, OCAPI relies on
the Cathedral-3 datapath synthesis tools, which produce
a bitparallel hardware implementation starting from
a set of signal flowgraphs. Controller synthesis 111 on the
other hand is done by the logic synthesis of Synopsys DC.
This divide and conquer strategy towards synthesis allows
each tool to be applied at the right place.

During system simulation, the system stimuli
113 are also translated into testbenches that are used to
verify the synthesis result of each component. After
interconnecting all synthesized components into the system
netlist, the final implementation can also be verified
using a generated system testbench 115.

The system model 

The system machine model that is used is a
set of concurrent processes. Each process translates to one
component in the final system implementation.

At the system level, processes execute using 
data flow simulation semantics. That is, a process is 
described as an iterative behavior, where inputs are read 
in at the start of an iteration, and outputs are produced 
at the end. Process execution can start as soon as the 
required input values are available. 

Inside each process, two types of
description are possible. The first one is an untimed
description, and can be expressed using any C++ constructs
available. A firing rule is also added to allow dataflow
simulation. Untimed processes are not subject to hardware
implementation but are needed to express the overall system
behavior. A typical example is a channel model used to
simulate a digital transceiver.

The second flavor of processes is timed.
These processes operate synchronously to the system clock.
One iteration of such a process corresponds to one clock
cycle of processing. Such a process consists of two
pieces: a control description and a data processing
description.

The control description is done by means of a
finite state machine, while the data description is a set
of instructions. Each instruction consists of a series of
signal assignments, and can also define process in- and
outputs. Upon execution, the control description is
evaluated to select one or more instructions for execution.
Next, the selected instructions are executed. Each
instruction thus corresponds to one clock cycle of RT
behavior.
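The split of a timed process into a control description and a set of instructions can be sketched in plain C++. The names `Datapath`, `Instruction`, `State` and `TimedProcess` are hypothetical and do not correspond to OCAPI classes. As a simplification, the selected instructions are executed one after the other within a cycle, whereas true RT signal assignments would be concurrent.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Registers and ports of a small accumulator (illustrative only).
struct Datapath {
    int acc = 0;
    int in  = 0;
    int out = 0;
};

// An instruction is a series of assignments on the datapath.
using Instruction = std::function<void(Datapath&)>;

enum class State { Clear, Accumulate };

class TimedProcess {
public:
    // One call corresponds to one clock cycle.
    void clock(int input) {
        dp_.in = input;

        // Control description: the FSM selects instructions for this cycle.
        std::vector<Instruction> selected;
        switch (state_) {
        case State::Clear:
            selected.push_back([](Datapath& d) { d.acc = 0; });
            state_ = State::Accumulate;
            break;
        case State::Accumulate:
            selected.push_back([](Datapath& d) { d.acc += d.in; });
            selected.push_back([](Datapath& d) { d.out = d.acc; });
            break;
        }

        // Data processing: execute the selected instructions.
        for (auto& instr : selected) instr(dp_);
    }

    int output() const { return dp_.out; }

private:
    State    state_ = State::Clear;
    Datapath dp_;
};
```

In this sketch the first clock cycle clears the accumulator, and every later cycle adds the input and drives the output.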

For system simulation, two schedulers are
available. A dataflow scheduler is used to simulate a
system that contains only untimed blocks. This scheduler
repeatedly checks process firing rules, selecting processes
for execution as their inputs become available. When the
system also contains timed blocks however, a cycle
scheduler is used. The cycle scheduler
interleaves the execution of multi-cycle descriptions, and can
incorporate untimed blocks as well.
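A dataflow scheduler of this kind can be sketched as a loop that keeps checking firing rules until no process can fire. The sketch below is an assumption-laden illustration, not the OCAPI scheduler: `Queue`, `Process`, `runDataflow` and `makeDoubler` are hypothetical names, and the firing bound `maxFirings` is added only to guarantee termination.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <vector>

using Queue = std::deque<double>;

// An untimed process: a firing rule plus one iteration of behavior.
struct Process {
    std::function<bool()> canFire;  // firing rule: are the inputs available?
    std::function<void()> fire;     // read inputs, produce outputs
};

// Repeatedly check the firing rules and fire every process that is ready.
// Stops when no process can fire or when maxFirings is reached.
int runDataflow(std::vector<Process>& procs, int maxFirings) {
    int firings = 0;
    bool progress = true;
    while (progress && firings < maxFirings) {
        progress = false;
        for (auto& p : procs) {
            if (firings < maxFirings && p.canFire()) {
                p.fire();
                ++firings;
                progress = true;
            }
        }
    }
    return firings;
}

// Example process: consumes one token and produces its double.
Process makeDoubler(Queue& in, Queue& out) {
    return Process{
        [&in] { return !in.empty(); },
        [&in, &out] {
            double x = in.front();
            in.pop_front();
            out.push_back(2.0 * x);
        }
    };
}
```

The order in which processes are listed does not affect the result here, because a process only fires once its input queue holds a token.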


The standard program 

The library of OCAPI has been developed with
the GNU g++ C++ compiler. The best mode embodiment uses the
g++ 2.8.1 compiler, and has been successfully compiled and
run under the HPUX 10 (HPUX10) operating system platform.
It is also possible to use a g++ 2.7.2 compiler, allowing
for compilation and run under operating system platforms
such as HPUX-9 (HPRISC), HPUX-10 (HPUX10), SunOS (SUN4),
Solaris (SUN5) and Linux 2.0.0 (LINUX).

The layout of the 'standard' g++ OCAPI
program will be explained, including compilation and
linking of this program.

First of all, g++ is a preferred standard
compilation environment. On Linux, this is already the case
after installation. Other operating system vendors however
usually have their own proprietary C++ compiler. In such
cases, the g++ compiler should be installed on the
operating system, and the PATH variable adapted such that
the shell can access the compiler.

The OCAPI library comes as a set of include
files and a binary lib. All of these are put into one
directory, which is called the BASE directory.

The 'standard program' is the minimal
contents of an OCAPI program. It has the following layout.

#include "qlib.h"

int main()
{
    // your program goes here
}



The #include "qlib.h" directive includes everything you
need to access all classes within OCAPI.

If this program is called "standard.cxx", then the
following makefile will transform the source code into an
executable for you:



HOSTTYPE = HPUX10

BASE = /imec/vsdm/OCAPI/release/v0.9
CC = g++

QFLAGS = -c -g -Wall -I${BASE}
LIBS = -lm

%.o: %.cxx
	$(CC) $(QFLAGS) $< -o $@

TARGET = standard

all: $(TARGET)

define lnkqlib
$(CC) -o $@ $^ $(LIBS)
endef

OBJS = standard.o

standard: ${OBJS} $(BASE)/lib$(HOSTTYPE)qlib.a
	${lnkqlib}

clean:
	rm -f *.o $(TARGET)



This is a makefile for GNU's "make"; other "make" programs
can have a slightly different syntax, especially for the
definition of the "lnkqlib" macro. It is not the shortest
possible solution for a makefile, but it is one that works
on different platforms without making assumptions about
standard compilation rules.

The compilation flags "QFLAGS" mean the following: "-c"
selects compilation-only, "-g" turns on debugging
information, and "-Wall" enables compiler warnings. The
debugging flag allows you to debug your program with "gdb",
the GNU debugger.

Even if you don't like a debugger and prefer "printf()"
debugging, "gdb" can at least be of great help in case
the program core dumps. Start the program under "gdb"
(type "gdb standard" at the shell prompt), type "run" to
let "standard" crash again, and then type "bt". You will
now see the call trace.

Calculation

OCAPI processes both floating point and fixed point values.
In contrast to the standard C++ data types like "int" and
"double", a "hybrid" data type class is used, that
simulates both fixed point and floating point behavior.

The dfix class

This class is called "dfix". The particular floating/fixed
point behavior is selected by the class constructor. The
standard format of this constructor is

dfix a;              // a floating point value
dfix a(0.5);         // a floating point value with initial value
dfix a(0.5, 10, 8);  // a fixed point value with initial value,
                     // 10 bits total word-length, 8 fractional bits

A fixed point value has a maximal precision equal to the
mantissa precision of a C++ "double". On most machines,
this is 53 bits.

A fixed point value can also select a representation, an
overflow behavior, and a rounding behavior. These flags
are, in this order, optional parameters to the "dfix"
constructor. They can have the following values.

• Representation flag: "dfix::tc" for two's complement
signed representation, "dfix::ns" for unsigned
representation.

• Overflow flag: "dfix::wp" for wrap-around overflow,
"dfix::st" for saturation.

• Rounding flag: "dfix::fl" for truncation (floor),
"dfix::rd" for rounding behavior.

Some examples are 

dfix a(0.5, 10, 8);
// the default is two's complement, wrap-around,
// truncated quantisation

dfix a(0.5, 10, 8, dfix::tc, dfix::st, dfix::rd);
// two's complement, saturation, rounding quantisation

dfix a(0.5, 10, 8, dfix::ns);
// unsigned, wrap-around, truncated quantisation
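
To make the effect of the three flags concrete, the sketch
below models the quantisation of a "double" to W total bits
of which L are fractional. It is not the OCAPI
implementation: the function "quantise" and the three enums
are hypothetical names, and rounding is assumed to be
round-half-up, which may differ from what "dfix" does at
corner cases.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-ins for the dfix flags; not part of OCAPI.
enum class Sign { tc, ns };      // two's complement, unsigned
enum class Overflow { wp, st };  // wrap-around, saturation
enum class Rounding { fl, rd };  // truncation (floor), rounding

// Quantise v to W total bits of which L are fractional.
double quantise(double v, int W, int L,
                Sign s = Sign::tc,
                Overflow o = Overflow::wp,
                Rounding r = Rounding::fl)
{
    assert(W > 0 && W <= 53 && L >= 0);
    const double scale = std::ldexp(1.0, L);   // 2^L
    double n = v * scale;                      // value in units of one lsb
    // Assumption: "rounding" means round-half-up.
    n = (r == Rounding::rd) ? std::floor(n + 0.5) : std::floor(n);

    const double span = std::ldexp(1.0, W);    // 2^W representable codes
    const double lo = (s == Sign::tc) ? -span / 2 : 0.0;  // smallest code
    const double hi = lo + span - 1;                      // largest code

    if (n < lo || n > hi) {
        if (o == Overflow::st)
            n = (n < lo) ? lo : hi;            // clip to the nearest bound
        else
            n -= span * std::floor((n - lo) / span);  // wrap into [lo, hi]
    }
    return n / scale;
}
```

Under these assumptions, "quantise(3.0, 10, 8)" wraps around
to -1.0, whereas with saturation the same call yields the
largest representable value, 511/256.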

When working with fixed point "dfix"es, it is important to
keep the following rule in mind: "quantisation occurs only
when a value is defined or assigned". This means that a
large expression with several intermediate results will
never have these intermediate values quantised. This should
be kept in mind especially when writing code for hardware
implementation, where intermediate results are also stored
in finite hardware and therefore will have some
quantisation behavior. There is, however, a "cast" operator
that helps here.
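
The difference between quantising only at assignment and
quantising every intermediate result can be illustrated
with plain doubles. The sketch below is a model under
stated assumptions, not OCAPI code: "truncate_to" is a
hypothetical helper that truncates to L fractional bits and
ignores overflow.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: truncate v to L fractional bits (no overflow handling).
double truncate_to(double v, int L)
{
    const double scale = std::ldexp(1.0, L);
    return std::floor(v * scale) / scale;
}

// Quantise only the final result, as happens for "a = x*y + u*v".
double assign_only(double x, double y, double u, double v, int L)
{
    return truncate_to(x * y + u * v, L);
}

// Quantise both products as well, as multipliers with an output of
// L fractional bits would; this mirrors inserting a cast on each product.
double with_cast(double x, double y, double u, double v, int L)
{
    return truncate_to(truncate_to(x * y, L) + truncate_to(u * v, L), L);
}
```

With two fractional bits, "assign_only(0.75, 0.25, 0.75,
0.25, 2)" gives 0.25, while "with_cast" on the same inputs
gives 0: each product (0.1875) is truncated to zero before
the addition.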

The dfix operators

The operators on "dfix" are shown below.

• +, -, *, /
  Standard addition, subtraction (including unary minus),
  multiplication and division.
• +=, -=, *=, /=
  In-place versions of the previous operators.
• abs
  Absolute value.
• <<, >>
  Left and right shifts.
• <<=, >>=
  In-place left and right shifts.
• msbpos
  Most-significant bit position.
• &, |, ^, ~
  Bitwise and, or, exor, and not operators.
• frac() (member call)
  Fractional part.
• ==, !=, <=, >=, <, >
  Relational operators: equal, different, smaller than or
  equal to, greater than or equal to, smaller than, greater
  than. These return an "int" instead of a "dfix".

All operators with the exception of the bitwise operators
work on the maximal fixed point precision (53 bits). The
bitwise operators have a precision of 32 bits (a C++
"long"). Also, they assume the fixed point representation
contains no fractional bits.

In addition to the arithmetic operators, several utility
methods are available for the "dfix" class.

dfix a, b;

// cast a to another type
b = cast(dfix(0, 12, 10), a);

// assign b to a, retaining the quantisation of a
a = b;

// assign b to a, including the quantisation
a.duplicate(b);

// return the integer part of b
int c = (int) b;

// retrieve the value of b as a double
double d, e;
d = b.Val();
e = Val(b);



// return quantisation characteristics of a
a.TypeW();         // returns the number of bits
a.TypeL();         // returns the number of fractional bits
a.TypeSign();      // returns dfix::tc or dfix::ns
a.TypeOverflow();  // returns dfix::wp or dfix::st
a.TypeRound();     // returns dfix::fl or dfix::rd

// check if two dfixes are identical in value and
// quantisation
identical(a, b);

// see whether a is floating or fixed point
a.TypeMode();      // returns dfix::fixpoint or dfix::floatpoint
a.isDouble();
a.isFix();

// write a to cout
cout << a;

// write a to stdout, in float format,
// on a field of 10 characters
write(cout, a, 'f', 10);

// now use a fixed-format
write(cout, a, 'g', 10);

// next assume a is a fixed point number, and write out an
// integer representation (considering the decimal point at
// the lsb of a); use a hexadecimal format
write(cout, a, 'x', 10);

// use a binary format
write(cout, a, 'b', 10);

// use a decimal format
write(cout, a, 'd', 10);

// read a from stdin
cin >> a;

Communication

Apart from values, OCAPI is concerned with the
communication of values in between blocks of behavior. The
high level method of communication in OCAPI is a FIFO
queue, of type "dfbfix". This queue is conceptually
infinite in length. In practice it is bounded by a phone
call from the sysop telling you that you have used up all
the swap space of the system.

The dfbfix class

A queue is declared as

dfbfix a("a");

This creates a queue with name a. The queue is intended to
pass value objects of the type "dfix". There is also an
alias type of "dfbfix", known as "FB" (flow buffer). So you
can also write

FB a("a");



The dfbfix operations

The basic operations on a queue allow you to store and
retrieve "dfix" objects. The operations are

dfix k;
dfix j(0.5);
dfbfix a("a");

// insert j at the front of a
a.put(j);

// operator format for an insert
a << j;

// insert j at position 5, with position 0 corresponding to
// the front of a.
a.putIndex(j, 5);

// read one element from the back of a
k = a.get();

// operator format for a read
a >> j;

// peek one element at position 1 of a
k = a.getIndex(1);

// operator format for peek
k = a[1];

// retrieve one element from a and throw it away
a.pop();

// throw away all elements, if any, from a
a.clear();

// return the number of elements in a as an int
int n = a.getSize();

// return the name of the queue
char *p = a.name();

Whenever you perform an access operation that reads past
the end of a FIFO, a runtime error results, showing

Queue Underflow @ get in queue a
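
The put/get/peek behavior described above can be mimicked
with a standard container. The class below is only a
sketch: "Fifo" is a hypothetical name, it stores plain
doubles instead of "dfix" objects, peek positions are
assumed to count from the front, and the underflow is
modeled as a C++ exception rather than whatever mechanism
OCAPI uses.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <stdexcept>
#include <string>

// Hypothetical model of a named FIFO of doubles; not the OCAPI dfbfix class.
class Fifo {
public:
    explicit Fifo(const std::string& name) : name_(name) {}

    // Insert at the front (position 0).
    void put(double v) { q_.push_front(v); }

    // Remove and return the element at the back, i.e. the oldest one.
    double get()
    {
        if (q_.empty())
            throw std::runtime_error("Queue Underflow @ get in queue " + name_);
        double v = q_.back();
        q_.pop_back();
        return v;
    }

    // Peek at position i (assumed to count from the front) without removal.
    double getIndex(std::size_t i) const
    {
        if (i >= q_.size())
            throw std::runtime_error("Queue Underflow @ getIndex in queue " + name_);
        return q_[i];
    }

    std::size_t getSize() const { return q_.size(); }

private:
    std::string name_;
    std::deque<double> q_;
};
```

Putting 2, 1 and 3 into such a queue and then calling
"get()" three times returns the values in the order they
were put; a fourth "get()" throws.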

Utility calls for dfbfix

Besides the basic operations on queues, there are some
additional utility operations that modify the behavior of
a queue.

// make a queue of length 20. The default length of a queue
// is 16. Whenever this length is exceeded by a put, the
// storage in the queue is dynamically expanded by a factor
// of 2.
dfbfix a("a", 20);

// After the asType() call, the queue will have an input
// "quantizer" that will quantize each element inserted
// into the queue to that of the quantizer type
dfix q(0, 10, 8);
a.asType(q);

// After an asDebug() call, the queue is associated with a
// file, that will collect every value written into the
// queue. The file is opened as the queue is initialized
// and closed when the queue object is destroyed.
a.asDebug("thisfile.dat");

// Next makes a duplicate queue of a, called b. Every write
// into a will also be done on b. Each queue is allowed to
// have at most ONE duplicate queue.
dfbfix b("b");
a.asDup(b);

// Thus, when another duplicate is needed, you write it as
dfbfix c("c");
b.asDup(c);

During the communication of "dfix" objects, the queues keep
track of some statistics on the values that are passed
through them. You can use the "<<" operator and the member
function "stattitle()" to make these statistics visible.

The next program demonstrates these statistics

#include "qlib.h"

void main()
{
    dfbfix a("a");

    a << dfix(2);
    a << dfix(1);
    a << dfix(3);

    a.stattitle(cout);
    cout << a;
}

When running this program, the following appears on screen

Name  put  get  MinVal      @idx  MaxVal      @idx  Max#  @idx
a     3    0    1.0000e+00  2     3.0000e+00  3     3     3

The first line is printed by the "stattitle()" call as a
mnemonic for the fields printed below. The next line is the
result of passing the queue to the standard output stream
object. The fields mean the following:

Name    The name of the queue
put     The total number of elements "put()" into the queue
get     The total number of elements "get()" from the queue
MinVal  The lowest element put onto the queue
@idx    The put sequential number that passed this
        lowest element
MaxVal  The highest element put onto the queue
@idx    The put sequential number that passed this
        highest element
Max#    The maximal queue length that occurred
@idx    The put sequential number that resulted in
        this maximal queue length
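
How such statistics can be collected is sketched below. The
struct "StatFifo" and its member names are hypothetical and
only follow the field names of the table; ties are resolved
in favor of the first occurrence, which is an assumption
that happens to agree with the sample output above.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical statistics-keeping FIFO; not the OCAPI dfbfix class.
struct StatFifo {
    std::deque<double> q;
    long put = 0, get = 0;          // number of push() and pop() calls
    double minVal = 0, maxVal = 0;  // extremes of the values put so far
    long minIdx = 0, maxIdx = 0;    // put sequence numbers of those extremes
    std::size_t maxLen = 0;         // largest queue length seen
    long maxLenIdx = 0;             // put sequence number that produced it

    void push(double v)
    {
        ++put;
        if (put == 1 || v < minVal) { minVal = v; minIdx = put; }
        if (put == 1 || v > maxVal) { maxVal = v; maxIdx = put; }
        q.push_front(v);
        if (q.size() > maxLen) { maxLen = q.size(); maxLenIdx = put; }
    }

    // The caller must make sure the queue is not empty.
    double pop()
    {
        double v = q.back();
        q.pop_back();
        ++get;
        return v;
    }
};
```

Pushing 2, 1 and 3 without any pop leaves put = 3, get = 0,
a minimum of 1 at put number 2, a maximum of 3 at put number
3, and a maximal length of 3 reached at put number 3.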



Globals and derivatives for dfbfix

There are two special derivatives of "dfbfix". Both are
derived classes, so you can use them wherever you would
use a "dfbfix". Only the first is discussed here; the other
one is related to cycle-true simulation and is discussed in
section "Faster Communications".

The "dfbfix_nil" object is like a "/dev/null" drain. Every
"dfix" written into this queue is thrown away. A read
operation from such a queue results in a runtime error.

There are two global variables related to queues. The
"listOfFB" is a pointer to a list of queues, containing
every queue object you have declared in your program. The
member function call "nextFB()" will return the successor
of the queue in the global list. For example, the code
snippet

dfbfix *r;
for (r = listOfFB; r; r = r->nextFB())
{
}

will walk through all the queues present in the OCAPI
program.

The other global variable is "nilFB", which is of the type
"dfbfix_nil". It is intended to be used as a global
trashcan.

The basic block

OCAPI supports the dataflow simulation paradigm. In order
to define the actors to the system, one "base" class is
used, from which all actors will inherit. In order to do
untimed simulations, one should follow a standard template
to which new actor classes must conform. In this section,
the standard template will be introduced, and the writing
style is documented.

Basic block include and code file 

Each new actor in the system is defined with one header 
file and one source code C++ file. We define a standard 
block, "add", which performs an addition. 

The include file, "add.h", looks like

#ifndef ADD_H
#define ADD_H

#include "qlib.h"

class add : public base
{
public:
    add(char *name, FB & _in1, FB & _in2, FB & _o1);
    int run();
private:
    FB *in1;
    FB *in2;
    FB *o1;
};

#endif

This defines a class "add" that inherits from "base". The
"base" object is the one that OCAPI likes to work with, so
you must inherit from it in order to obtain an OCAPI basic
block.

The private members in the block are pointers to
communication queues. Optionally, the private members
can also contain state, for example the tap values in a
filter. The management of state for untimed blocks is
entirely the responsibility of the user; OCAPI does not
care what you use as extra variables.

The public members include a constructor and an execution
call "run". The constructor must at least take a name
and a list of the queues that are used for communication.
Optionally, some parameters can be passed, for instance in
case of parametrized blocks (filters with a variable number
of taps and the like).

The contents of the adder block will be described in
"add.cxx".

#include "add.h"

add::add(char *name, FB & _in1, FB & _in2, FB & _o1) :
    base(name)
{
    in1 = _in1.asSource(this);
    in2 = _in2.asSource(this);
    o1 = _o1.asSink(this);
}

int add::run()
{
    // firing rule
    if (in1->getSize() < 1)
        return 0;
    if (in2->getSize() < 1)
        return 0;

    o1->put(in1->get() + in2->get());
    return 1;
}

The constructor passes the name of the object to the "base"
class it inherits from. In addition, it initializes private
members with the other parameters. In this example, the
communication queue pointers are initialized. This is not
done through simple pointer assignment, but through
function calls "asSource" and "asSink". This is not
obligatory, but allows OCAPI to analyze the connectivity in
between the basic blocks. Since a queue is intended for
point-to-point communication, it is an error to use a queue
as input or output more than once. The function calls
"asSource" and "asSink" keep track of which blocks
source/sink which queues. They will raise a runtime error
in case a queue is sourced or sinked more than once. The
constructor can optionally also be used to perform
initialization of other private data (state for instance).

The "run()" method contains the operations to be performed
when the block is invoked. The behavior is described in an
iterative way. The "run" function must return an integer
value, 1 if the block succeeded in performing the
operation, and 0 if this has failed.



This behavior consists of two parts: a firing rule and an
operative part. The firing rule must check for the
availability of data on the input queues. When
insufficient data is present (checked with the "getSize()"
member call), it stops execution and returns 0. When
sufficient data is present, execution can start. Execution
of an untimed behavior can use the different C++ control
constructs available. In this example, the two input
queues are read, the values are added, and the result is
put into the output queue. After execution, the value 1 is
returned to signal the behavior has completed.
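
The firing-rule pattern can be tried out without the OCAPI
library. The following sketch replaces the OCAPI queue by
"std::queue<double>"; the names "Queue" and "Adder" are
hypothetical, and the sketch omits the "base" class and the
"asSource"/"asSink" bookkeeping.

```cpp
#include <cassert>
#include <queue>

typedef std::queue<double> Queue;  // stand-in for an OCAPI queue

// Hypothetical adder actor following the firing-rule template.
struct Adder {
    Queue* in1;
    Queue* in2;
    Queue* o1;

    int run()
    {
        // firing rule: one token is needed on each input
        if (in1->empty() || in2->empty())
            return 0;

        // operative part: consume one token per input, produce one output
        double x = in1->front(); in1->pop();
        double y = in2->front(); in2->pop();
        o1->push(x + y);
        return 1;
    }
};
```

Note that a failed firing leaves the inputs untouched: a
token already waiting on one input stays there until the
other input has data as well.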

Predefined standard blocks: file sources and sinks 

The OCAPI library contains three predefined standard
blocks: a file source "src", a file sink "snk",
and a ram storage block "ram".

The file sources and sinks define operating system
interfaces and allow you to bring file data into an OCAPI
simulation, and to write out resulting data to a file. The
examples below show various declarations of these blocks.
Data in these files is formatted as floating point numbers
separated by white space. For output, newlines are used as
whitespace.

// define a file source block, with name a, that will read
// data from the file "in.dat" and put it into the queue k
dfbfix k("k");
src a("a", k, "in.dat");

// an alternative definition is
dfbfix k("k");
src a("a", k);
a.setAttr(src::FILENAME, "in.dat");

// which also gives you a complex version
dfbfix k1("k1");
dfbfix k2("k2");
src a("a", k1, k2);
a.setAttr(src::FILENAME, "in.dat");

// define a sink block b, that will put data from queue o
// into a file "out.dat".
dfbfix o("o");
snk b("b", o, "out.dat");

// an alternative definition is
dfbfix o("o");
snk b("b", o);
b.setAttr(snk::FILENAME, "out.dat");

// which also gives one a complex version
dfbfix o1("o1");
dfbfix o2("o2");
snk b("b", o1, o2);
b.setAttr(snk::FILENAME, "out.dat");

// the snk mode has also a matlab-goodie which will format
// output data into a matrix A that can be read in directly
// by Matlab.
dfbfix o("o");
snk b("b", o, "out.m");
b.setAttr(snk::FILENAME, "out.m");
b.setAttr(snk::MATLABMODE, 1);

Predefined standard blocks: RAM 

The ram untimed block is intended to simulate single-port
storage blocks at high level. By necessity, some
interconnect assumptions had to be made on this block. On
the other hand, it is supported all the way through code
generation.

OCAPI does not generate RAM cells. However, it will
generate appropriate connections in the resulting system
netlist, onto which a RAM cell can be connected.

The declaration of a ram block is as follows.

// make a ram a, with an address bus, a data input bus, a
// data output bus, a read command line, a write command
// line, with 64 locations
dfbfix address("address");
dfbfix data_in("data_in");
dfbfix data_out("data_out");
dfbfix read_c("read_c");
dfbfix write_c("write_c");

ram a("a", address, data_in, data_out, write_c, read_c, 64);

// clear the ram
a.clear();

// fill the ram with the linear sequence
// data = k1 + address * k2
a.fill(k1, k2);

// dump the contents of a to cout
a.show();

The execution semantics of the ram are as follows. For each
read or write, an address, a read command and a write
command must be presented. If the read command equals
"dfix(1)", a read will be performed, and the value stored
at the location presented through "address" will be put on
"data_out". If the read command equals any other value, a
dummy byte will be presented at "data_out". If no read
command was presented, no data will be presented on
"data_out". For writes, a similar story holds on the
"data_in" input: whenever a write command is presented, the
data input will be consumed. When the write command equals
1, the data input will be stored in the location provided
through "address". When a read and a write command are
given at the same time, the read will be performed before
the write. The ram also includes an online "purifier" that
will generate a warning message whenever data from an
unwritten location is read.
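
These semantics can be summarized in a few lines of C++.
The class below is a sketch and not the OCAPI "ram" block:
"RamModel" and "access" are hypothetical names, one call
stands for one presented address/read/write triple, the
dummy value is assumed to be 0, and the purifier is modeled
as a warning counter instead of a printed message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-port RAM model; names are illustrative, not OCAPI's.
class RamModel {
public:
    explicit RamModel(std::size_t size)
        : data_(size, 0.0), written_(size, false), warnings_(0) {}

    // One access; returns the value driven on data_out.
    // A read command of 1 reads the addressed location, any other value
    // yields a dummy 0. A write command of 1 stores data_in. The read is
    // performed before the write.
    double access(std::size_t address, int read_c, int write_c, double data_in)
    {
        double out = 0.0;                    // dummy value (assumption)
        if (read_c == 1) {
            if (!written_.at(address))
                ++warnings_;                 // "purifier": unwritten location read
            out = data_.at(address);
        }
        if (write_c == 1) {
            data_.at(address) = data_in;
            written_.at(address) = true;
        }
        return out;
    }

    int warnings() const { return warnings_; }

private:
    std::vector<double> data_;
    std::vector<bool> written_;
    int warnings_;
};
```

A simultaneous read and write to a fresh location therefore
returns the old (unwritten) content and counts one warning;
the written value only shows up on the next read.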

Untimed simulations

Given the descriptions of one or more untimed blocks, a
simulation can be done. The description of a simulation
requires the following to be included in a standard C++
"main()" procedure:

• The instantiation of one or more basic blocks.
• The instantiation of one or more communication queues
  that interconnect the blocks.
• The setup of stimuli. Either these can be included at
  runtime by means of the standard file source blocks, or
  else dedicated C++ code can be written that fills up a
  queue with stimuli.
• A schedule that drives the execution methods of the basic
  blocks.

A schedule, in general, is the specification of the
sequence in which block firing rules must be tested (and
fired if necessary) in order to run a simulation. There has
been quite some research in determining how such a schedule
can be constructed automatically from the interconnection
network and knowledge of the block behavior. Up to now, an
automatic mechanism for a general network with arbitrary
blocks has not been found. Therefore, OCAPI relies on the
designer to construct such a schedule.

Layout of an untimed simulation

In this section, the template of the standard simulation
program will be given, along with a description of the
"scheduler" class that will drive the simulation. A
configuration with the "adder" block (described in the
section on basic blocks) is used as an example.

#include "qlib.h"
#include "add.h"

void main()
{
    dfbfix i1("i1");
    dfbfix i2("i2");
    dfbfix o1("o1");

    src SRC1("SRC1", i1, "SRC1");
    src SRC2("SRC2", i2, "SRC2");
    add ADD("ADD", i1, i2, o1);
    snk SNK1("SNK1", o1, "SNK1");

    schedule S1("S1");
    S1.next(SRC1);
    S1.next(SRC2);
    S1.next(ADD);
    S1.next(SNK1);

    while (S1.run());

    i1.stattitle(cout);
    cout << i1;
    cout << i2;
    cout << o1;
}

The simulation above instantiates three communication
buffers that interconnect four basic blocks. The
instantiation defines at the same time the interconnection
network of the simulation. Three of the untimed blocks are
standard file sources and sinks, provided with OCAPI. The
"add" block is a user defined one.

After the definition of the interconnection network, a
schedule must be defined. A simulation schedule is
constructed using "schedule" objects. In the example, one
schedule object is defined, and the four blocks are
assigned to it by means of a "next()" member call.

The order in which "next()" calls are done determines the
order in which firing rules will be tested. For each
execution of the schedule object "S1", the "run()" methods
of "SRC1", "SRC2", "ADD" and "SNK1" are called, in that
order. The execution method of a scheduler object is called
"run()". This function returns an integer, equal to one
when at least one block in the current iteration has
executed (i.e. the "run()" of the block has returned one).
When no block has executed, it returns zero.
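
The return-value convention of the scheduler can be
captured in a small stand-alone class. This is a sketch
under stated assumptions, not the OCAPI "schedule" class:
the name "Schedule" is hypothetical, and blocks are
represented by callables that return 1 when they fired
instead of objects derived from "base".

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical scheduler: blocks are callables returning 1 when they fired.
class Schedule {
public:
    void next(std::function<int()> block) { blocks_.push_back(block); }

    // Test every firing rule once, in the order the blocks were added.
    // Returns 1 when at least one block fired in this iteration.
    int run()
    {
        int fired = 0;
        for (std::size_t i = 0; i < blocks_.size(); ++i)
            if (blocks_[i]() == 1)
                fired = 1;
        return fired;
    }

private:
    std::vector<std::function<int()> > blocks_;
};
```

A loop of the form "while (s.run());" then keeps iterating
until an entire pass over the blocks produces no firing.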

The while loop in the program therefore is an execution of
the simulation. Let us assume that the directory of the
simulator executable contains the two required stimuli
files, "SRC1" and "SRC2". Their contents are as follows

SRC1    SRC2    (not present in the file)
----    ----    (not present in the file)
1       4
2       5
3       6

When compiling and running this program, the simulator
responds:

*** INFO: Defining block SRC1
*** INFO: Defining block SRC2
*** INFO: Defining block ADD
*** INFO: Defining block SNK1
Name  put  get  MinVal      @idx  MaxVal      @idx  Max#  @idx
i1    3    3    1.0000e+00  1     3.0000e+00  3     1     1
i2    3    3    4.0000e+00  1     6.0000e+00  3     1     1
o1    3    3    5.0000e+00  1     9.0000e+00  3     1     1

and in addition has created a file "SNK1", containing

SNK1            (not present in the file)
----            (not present in the file)
5.000000e+00
7.000000e+00
9.000000e+00

The "INFO" messages appearing on standard output are a side
effect of creating a basic block. The table at the end is
produced by the print statements at the end of the program.



More on schedules

If you would closely examine which blocks are fired in
which iteration (for instance with a debugger), then you
would find

iteration 1
    run SRC1                 => i1 contains 1.0
    run SRC2                 => i2 contains 4.0
    run ADD                  => o1 contains 5.0
    run SNK1                 => write out o1
    schedule.run() returns 1
iteration 2
    run SRC1                 => i1 contains 2.0
    run SRC2                 => i2 contains 5.0
    run ADD                  => o1 contains 7.0
    run SNK1                 => write out o1
    schedule.run() returns 1
iteration 3
    run SRC1                 => i1 contains 3.0
    run SRC2                 => i2 contains 6.0
    run ADD                  => o1 contains 9.0
    run SNK1                 => write out o1
    schedule.run() returns 1
iteration 4
    run SRC1                 => at end-of-file, fails
    run SRC2                 => at end-of-file, fails
    run ADD                  => no input tokens, fails
    run SNK1                 => no input tokens, fails
    schedule.run() returns 0 => end simulation

There are two schedule member functions, "traceOn()" and
"traceOff()", that will produce similar information for
you. If you insert

S1.traceOn();

just before the while loop, then you see

*** INFO: Defining block SRC1
*** INFO: Defining block SRC2
*** INFO: Defining block ADD
*** INFO: Defining block SNK1
S1 [ SRC1 SRC2 ADD SNK1 ]
S1 [ SRC1 SRC2 ADD SNK1 ]
S1 [ SRC1 SRC2 ADD SNK1 ]
S1 [ ]
Name  put  get  MinVal      @idx  MaxVal      @idx  Max#  @idx
i1    3    3    1.0000e+00  1     3.0000e+00  3     1     1
i2    3    3    4.0000e+00  1     6.0000e+00  3     1     1
o1    3    3    5.0000e+00  1     9.0000e+00  3     1     1

appearing on the screen. This trace feature is convenient
during schedule debugging.

In the simulation output, you can also notice that the
maximum number of tokens in the queues never exceeds one.
Had you entered another schedule sequence, for example

schedule S1("S1");
S1.next(ADD);
S1.next(SRC2);
S1.next(SRC1);
S1.next(SNK1);

then you would notice that the maximum number of tokens on
the queues would result in different figures. On the other
hand, the resulting data file, "SNK1", will contain exactly
the same results. This demonstrates one important property
of dataflow simulations: any arbitrary but consistent
schedule yields the same results. Only the required amount
of storage will change from schedule to schedule.
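
This property can be checked with a stand-alone model of
the adder example. The sketch below is not OCAPI code: the
function "simulate", the struct "Result" and the one-letter
block codes are hypothetical, and the two orders compared in
the usage note are chosen for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

struct Result {
    std::vector<double> out;   // what the sink wrote
    std::size_t maxLenI1;      // largest length reached by queue i1
};

// Run the adder example with the blocks tested in the given order.
// 'order' is a string over the hypothetical one-letter block codes
// '1' (SRC1), '2' (SRC2), 'A' (ADD) and 'K' (SNK1).
Result simulate(const std::string& order)
{
    const std::vector<double> d1 = {1, 2, 3}, d2 = {4, 5, 6};
    std::size_t p1 = 0, p2 = 0;            // read positions in the stimuli
    std::deque<double> i1, i2, o1;
    Result r;
    r.maxLenI1 = 0;

    bool fired = true;
    while (fired) {                        // one pass = one schedule iteration
        fired = false;
        for (char b : order) {
            if (b == '1' && p1 < d1.size()) {
                i1.push_back(d1[p1++]);
                r.maxLenI1 = std::max(r.maxLenI1, i1.size());
                fired = true;
            } else if (b == '2' && p2 < d2.size()) {
                i2.push_back(d2[p2++]);
                fired = true;
            } else if (b == 'A' && !i1.empty() && !i2.empty()) {
                o1.push_back(i1.front() + i2.front());
                i1.pop_front();
                i2.pop_front();
                fired = true;
            } else if (b == 'K' && !o1.empty()) {
                r.out.push_back(o1.front());
                o1.pop_front();
                fired = true;
            }
        }
    }
    return r;
}
```

In this model, the orders "12AK" and "1A2K" both make the
sink write 5, 7, 9, but queue i1 holds at most one token in
the first case and two in the second.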



In multirate systems, it is convenient to have different
schedule objects and group all blocks working on the same
rate in one schedule.



Profiling in untimed simulations

Untimed simulations are not targeted to circuit
implementation. Rather, they have an explorative character.
Besides the queue statistics, OCAPI also enables you to do
precise profiling of operations. The requirements for this
feature are that

• You use "schedule" objects to construct the simulation
• You describe block behavior with "dfix" objects

Profiling is enabled by default. To view profiling results,
you send the schedule object under consideration to the
standard output stream. In the "main" example program given
above, you can modify this as

#include "qlib.h"
#include "add.h"

void main()
{
    ...
    schedule S1("S1");
    ...
    cout << S1;
}



When running the simulation, you will see the following
appearing on stdout:

*** INFO: Defining block SRC1
*** INFO: Defining block SRC2
*** INFO: Defining block ADD
*** INFO: Defining block SNK1
Name  put  get  MinVal      @idx  MaxVal      @idx  Max#  @idx
i1    3    3    1.0000e+00  1     3.0000e+00  3     1     1
i2    3    3    4.0000e+00  1     6.0000e+00  3     1     1
o1    3    3    5.0000e+00  1     9.0000e+00  3     1     1
Schedule S1 ran 4 times:
    SRC1  3
    SRC2  3
    ADD   3
        +  3
    SNK1  3


For each schedule, it is reported how many times it was
run. Inside each schedule, a firing count of each block is
given. Inside each block, an operation execution count is
given. The simple "add" block gives the rather trivial
result that there were three additions done during the
simulation.

The gain in using operation profiling is to estimate the
computational requirement for each block. For instance, if
you find that you need to do 23 multiplications in a block
that was fired 5 times, then you would need at least five
multipliers to guarantee the block implementation will need
only one cycle to execute.
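
The estimate in this example is a ceiling division of the
operation count by the firing count. A minimal sketch,
assuming the operations are spread evenly over the firings
(the function name "operators_needed" is hypothetical):

```cpp
#include <cassert>

// Lower bound on the number of operators needed so that each firing
// fits in one cycle, assuming operations are spread evenly over firings.
long operators_needed(long operation_count, long firing_count)
{
    assert(operation_count >= 0 && firing_count > 0);
    return (operation_count + firing_count - 1) / firing_count;  // ceiling
}
```

For 23 multiplications over 5 firings this gives 5, since
23/5 = 4.6 operations per firing cannot be served by four
multipliers in a single cycle.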

Finally, if you want to suppress operation profiling for
some blocks, then you can use the member function call
"noOpsCnt()" for each block. For instance, writing

ADD.noOpsCnt();

suppresses operation profiling in the ADD block.

Implementation

The features presented in the previous sections contain
everything you need to do untimed, high level simulations.
This kind of simulation is useful for initial
development. For real implementation, more detail has to be
added to the descriptions.

OCAPI makes few assumptions on the target architecture of
your system. One is that you target bit-parallel and
synchronous hardware. Synchronicity is not a basic
requirement for OCAPI. The current version however
constructs single-thread simulations, and also assumes that
all hardware runs at the same clock. If different clocks
need to be implemented, then a change to the clock-cycle
true simulation algorithm will have to be made. Also, it is
assumed that one basic block will eventually be implemented
into one processor.

One question that comes to mind is how hardware sharing
between different basic blocks can be expressed. The answer
is that you will have to construct a basic block that
merges the behaviors of the two other blocks. Some
designers might feel reluctant to do this. On the other
hand, if you have to write down merged behavior, you will
also have to think about the control problems that are
induced by this merging. OCAPI will not solve this
problem for you, though it will provide you with the means
to express it.

Before code generation will translate a description to an
HDL, one will have to take care of the following tasks:

• One will have to specify wordlengths. The target
  hardware is capable of doing bit-parallel, fixed point
  operations, but not of doing floating point operations.
  One of the design tasks is to perform the quantisation
  on floating point numbers. The "dfix" class discussed
  earlier contains the mechanisms for expressing fixed
  point behavior.

• One will have to construct a clock-cycle true
  description. In constructing this description, one will
  not have to allocate actual hardware, but rather express
  which operations one expects to be performed in which
  clock cycle. The semantic model for describing this
  clock cycle true behavior consists of a finite state
  machine and a set of signal flow graphs. Each signal
  flow graph expresses one cycle of implemented behavior.
  This style of description splits the control operations
  from data operations in your program. In contrast, the
  untimed description you have used before has a common
  representation of control and data.

OCAPI does not force an ordering on these tasks. For
instance, one might first develop a clock cycle true
description on floating point numbers, and afterwards
tackle the quantization issues. This eases verification of
the clock-cycle true circuit against the untimed high level
simulation.

The final implementation also assumes that all
communication queues will be implemented as wiring. They
will contain no storage, nor will they be subject to buffer
synthesis. In a dataflow simulation, initial buffering
values can however be necessary (for instance in the
presence of feedback loops). In OCAPI, such a buffer must
be implemented as an additional processor that incorporates
the required storage. The resulting system dataflow will
become deadlocked because of this. The cycle scheduler
that simulates timed descriptions, however, is clever
enough to look for these "initial tokens" inside of the
descriptions.

In the next sections, the classes that allow you to express
clock cycle true behavior are introduced.

Signals and signal flowgraphs

Some initial considerations on signals are introduced
first.

Hardware versus Software

Software programs always use memory to store variables. In
contrast, hardware programs work with signals, which might
or might not be stored into a register. This feature can be
expressed in OCAPI by using the "_sig" class. Simply
speaking, a "_sig" is a "dfix" for which one has indicated
whether it needs storage or not.

In implementation, a signal with storage is mapped to a net
driven by a register, while an immediate signal is mapped
to a net driven by an operator.



Besides the storage issue, a signal also departs from the
concept of "scope" one uses in a program. For instance, in
a function one can use local variables, which are destroyed
(i.e. for which the storage is reclaimed) after one has
executed the function. In hardware however, one controls
the signal-to-net mapping by means of the clock signal.

Therefore one has to manage the scope of signals. The
signal scope is expressed by using a signal flowgraph
object, "sfg". A signal flowgraph marks a boundary on
hardware behavior, and will allow subsequent synthesis
tools to find out operator allocation, hardware sharing and
signal-to-net mapping.

The _sig class and related operations

Hardware signals can be expressed in three flavors. They
can be plain signals, constant signals, or registered
signals. The following example shows how these three can be
defined.

// define a plain signal a, with a floating point dfix
// inside of it.
_sig a("a");

// define a plain signal b, with a fixed point dfix inside
// of it.
_sig b("b", dfix(0,10,8));

// define a registered signal c, with an initial value k
// and attached to a clock ck.
dfix k(0.5);
clk ck;
_sig c("c", ck, k);

// define a constant signal d, equal to the value k
_sig d(k);

The registered signals, and more in particular the clock
object, are explained in more detail when signal
flowgraphs and finite state machines are discussed. This
section concentrates on operations that are available for
signals.

Using signals and signal operations, one can construct
expressions. The signal operations are a subset of the
operations on "dfix". This is because there is a hardware
operator implementation behind each of these operations.

• +, -, *
  Standard addition, subtraction (including unary minus),
  multiplication
• &, |, ^, ~
  Bitwise and, or, exor, and not operators
• ==, !=, <=, >=, <, >
  Relational operators
• <<, >>
  Left and right shifts
• s.cassign(s1, s2)
  Conditional assignment with s1 or s2 depending on s
• cast(T, s)
  Convert the type of s to the type expressed in "dfix" T
• lu(L, s)
  Use s as an index into lookup table L and retrieve the
  corresponding entry
• msbpos(s)
  Return the position of the msb in s

Precision considerations are the same as for "dfix". That
is, precision is at most the mantissa precision of a double
(53 bits). For the bitwise operations, 32 bits are assumed
(a long). "cast", "lu" and "msbpos" are not member but
friend functions. In addition, "msbpos" expects fixed-point
signals.
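The 53-bit limit can be checked with plain C++ doubles, independently of OCAPI. The sketch below is only an illustration; the function name "survives_in_double" is hypothetical and not part of the library.

```cpp
#include <cmath>

// Returns true when the integer 2^bits + 1 is exactly representable as a
// double, i.e. when a word of (bits + 1) bits still fits in the mantissa.
// Hypothetical helper, not an OCAPI function.
bool survives_in_double(int bits)
{
    const double base = std::ldexp(1.0, bits);   // 2^bits is always exact
    volatile double next = base + 1.0;           // volatile: force rounding to 64 bits
    return next != base;                         // false when the +1 was rounded away
}
```

With IEEE 754 doubles, survives_in_double(52) holds (a 53-bit word), while survives_in_double(53) does not (a 54-bit word).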

_sig a("a");
_sig b("b");
_sig c("c");

// some simple operations
c = a + b;
c = a - b;
c = a * b;

// bitwise operations work only on fixed point signals
_sig e(dfix(0xff,10,0));
_sig d("d", dfix(0,10,0));
_sig f("f", dfix(0,10,0));
f = d & e;
f = d | e;
f = ~d;
f = d ^ _sig(dfix(3,10,0));

// shifting
// a dfix is automatically promoted to a constant _sig
f = d << dfix(3,8,0);

// conditional assignment
f = (d < dfix(2,10,0)).cassign(e, d);

// type conversion is done with cast
_sig g("g", dfix(0,3,0));
g = cast(dfix(0,3,0), d);

// a lookup table is an array of unsigned long
unsigned long j[] = {1, 2, 3, 4, 5};
// a lookuptable with 5 elements, 3 bits wide
lookupTable j_lookup("j_lookup", 5, dfix(0,3,0));
j_lookup = j;
// find element 2
g = lu(j_lookup, dfix(2,3,0));

If one is interested in simulation only, then one need
not worry too much about type casting and the like.
However, if one intends implementation, then some rules
apply. These rules are imposed by the hardware synthesis
tools. If one fails to obey them, then one will get a
runtime error during hardware synthesis.

• All operators, apart from multiplication, return a
signal with the same wordlength as the input signal.

• Multiplication returns a wordlength that is the sum of
the input wordlengths.

• Addition, subtraction, bitwise operations, comparisons
and conditional assignment require the two input
operands to have the same wordlength.

Some common pitfalls that result from these restrictions are
the following.

• Intermediate results will, by default, not expand
wordlength. In contrast, operations on dfix do not lose
precision on intermediate results. For example, shifting
an 8 bit signal up 8 positions will return the value
of zero, on 8 bits. If you want to keep the
precision, then you must first cast the operand to the
desired output wordlength, before doing the shift.

• The multiplication operator increases the wordlength,
which is not automatically reduced when you assign the
result to a signal of smaller width. If you want to
reduce wordlength, then you must do this by using a cast
operation.
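Both pitfalls can be reproduced with ordinary unsigned integers. The sketch below only models the wordlength rules stated above; it does not use OCAPI, and the helper names "truncate_to", "shift_left_in_width" and "multiply_full" are assumptions made for this illustration.

```cpp
#include <cstdint>

// Keep only the lowest `width` bits (models a hardwired cast, width 1..32).
std::uint32_t truncate_to(std::uint32_t value, unsigned width)
{
    return width >= 32 ? value : (value & ((std::uint32_t{1} << width) - 1u));
}

// Left shift whose result keeps the wordlength `width` of its operand,
// as a signal shift does. `amount` is assumed to be smaller than 32.
std::uint32_t shift_left_in_width(std::uint32_t value, unsigned width, unsigned amount)
{
    return truncate_to(truncate_to(value, width) << amount, width);
}

// Multiplication returns the sum of the input wordlengths (wa + wb <= 32);
// reducing the result again is left to an explicit truncate_to() call.
std::uint32_t multiply_full(std::uint32_t a, unsigned wa, std::uint32_t b, unsigned wb)
{
    return truncate_to(truncate_to(a, wa) * truncate_to(b, wb), wa + wb);
}
```

Shifting the 8 bit value 0xA5 up 8 positions with shift_left_in_width(0xA5, 8, 8) yields zero, whereas treating it as a 16 bit word first, shift_left_in_width(0xA5, 16, 8), keeps 0xA500.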

For complex expressions, these type promotion rules look a
bit tedious. They are however used because they allow you
to express behavior precisely down to the bit level. For
example, the following piece of code extracts each of the
bits of a three bit signal:

_sig threebits(dfix(6,3,0));
dfix bit(0,1,0);
_sig bit2("bit2"), bit1("bit1"), bit0("bit0");

bit2 = cast(bit, threebits >> dfix(2));
bit1 = cast(bit, threebits >> dfix(1));
bit0 = cast(bit, threebits);

These bit manipulations would not be possible without the given
type promotion rules.
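The same extraction can be mimicked on a plain integer: a right shift that keeps the 3 bit wordlength, followed by a cast to 1 bit that keeps only the least significant bit. This is a sketch for illustration only; "extract_bits" is a hypothetical name.

```cpp
#include <array>
#include <cstdint>

// Returns {bit0, bit1, bit2} of a 3 bit value. Each element models
// cast(bit, threebits >> n): shift right, then keep one bit.
std::array<std::uint32_t, 3> extract_bits(std::uint32_t threebits)
{
    const std::uint32_t v = threebits & 0x7u;          // 3 bit wordlength
    return {{ v & 1u, (v >> 1) & 1u, (v >> 2) & 1u }};
}
```

For the value 6 used above (binary 110), this gives bit0 = 0, bit1 = 1 and bit2 = 1.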

For hardware implementation, the following operators are
present.

• Addition and subtraction are implemented on ripple-carry
adder/subtracters.

• Multiplication is implemented with a Booth multiplier
block.

• Casts are hardwired.

• Shifts are either hardwired in case of constant shifts,
or else a barrel shifter is used in case of variable
shifts.

• Comparisons are implemented with dedicated comparators
(in case of constant comparisons), or subtractions (in
case of variable comparisons).

• Bitwise operators are implemented by their direct gate
equivalent at the bit level.

• Lookup tables are implemented as PLA blocks that are
mapped using two-level or multi-level random logic.

• Conditional assignment is done using multiplexers.

• Msbit detection is done using a dedicated msbit-detector.

Globals and utility functions for signals 

There are a number of global variables that directly relate
to the "_sig" class, as well as the embedded "sig" class.
In normal circumstances, you do not need to use these
variables and functions.

The variables "glbNumberOf_Sig" and "glbNumberOfSig"
contain the number of "_sig" and "sig" that your program
has defined. The variable "glbNumberOfReg" contains the
number of "sig" that are of the register type. This
represents the word-level register count of your design.
The variable "glbSigHashConflicts" contains the number of hash
conflicts that are present in the internal signal data
structure organization. If this number is more than, say, 5%
of "glbNumberOfSig", then you might consider knocking at
OCAPI's complaint counter. The simulation is not incorrect if you
exceed this bound; it will only go slower.

The variable "glbListOfSig" contains a global list of
signals in your system. You can go through it by means of

sig *run;
for (run = glbListOfSig; run; run = run->nextsig())
{
}

For each such "sig", you can access a number of utility
member functions.

• "isregister()" returns 1 when a signal is a register.

• "isconstant()" returns 1 when a signal is a constant
value.

• "isterm()" returns 1 when you have defined this signal
yourself. These are signals which are introduced through
"_sig()" class constructors. OCAPI however also adds
signals of its own.

• "getname()" returns the "char *" name you have used to
define the signal.

• "get_showname()" returns the "char *" name of the signal
that is used for code generation. This is equal to the
original name, but with a unique suffix appended to it.



The sfg class 

In order to construct a timed (clocked) simulation, signals
and signal expressions must be assigned to a signal
flowgraph. A signal flowgraph (in the context of OCAPI) is
a container that collects all behavior that must be
executed during one clock cycle.

The sfg behavior contains

• A set of expressions using signals

• A set of inputs and outputs that relate signals to
output and input queues

Thus, a signal flowgraph object connects local behavior
(the signals) to the system through communication queues.
In hardware, the indication of input and output signals
also results in ports on your resulting circuit.



A signal flowgraph can be a marker of hardware scope. This
is demonstrated by the following example.

_sig a("a");
_sig b("b");
_sig c(dfix(2));

dfbfix A("A");
dfbfix B("B");

// a signal flowgraph object is created
sfg add_two, add_three;

// from now on, every signal expression written down will
// be included in the signal flowgraph add_two
add_two.starts();
a = b + c;

// You must also give a name to add_two, for code
// generation
add_two << "add_two";

// also, inputs and outputs have to be indicated.
// you use the input and output objects ip and op for this
add_two << ip(b, B);
add_two << op(a, A);

// next expression will be part of add_three
add_three.starts();
a = b + dfix(3);

add_three << "add_three";
add_three << ip(b, B);
add_three << op(a, A);

// you can also do semantical checks on signal flowgraphs
add_two.check();
add_three.check();

The semantical check warns you about the following
specification errors:

• Your signal flowgraph contains a signal which is not
declared as a signal flowgraph input and at the same
time, it is not a constant or a register. In other
words, your signal flowgraph has a dangling input.

• You have written down a combinatorial loop in your
signal flowgraph. Each signal must be ultimately
dependent on registered signals, constants, or signal
flowgraph inputs. If any other dependency exists, you
have written down a combinatorial loop for which
hardware synthesis is not possible.
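How such a check could work is sketched below on a plain dependency map. This is not the OCAPI implementation of "check()"; the types "Deps" and "Check" and the function "check_flowgraph" are hypothetical, and the depth-first search is simply one common way to detect the two errors described above.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

enum class Check { Ok, DanglingInput, CombinatorialLoop };

// For each computed signal: the names of the operands of its expression.
using Deps = std::map<std::string, std::vector<std::string>>;

// Depth-first walk from signal s. `known` holds registers, constants and
// declared flowgraph inputs; `active` holds the signals on the current path.
static Check visit(const std::string& s, const Deps& deps,
                   const std::set<std::string>& known,
                   std::set<std::string>& active, std::set<std::string>& done)
{
    if (known.count(s) || done.count(s)) return Check::Ok;
    if (active.count(s)) return Check::CombinatorialLoop;   // came back to s
    const auto it = deps.find(s);
    if (it == deps.end()) return Check::DanglingInput;      // never defined
    active.insert(s);
    for (const auto& operand : it->second) {
        const Check r = visit(operand, deps, known, active, done);
        if (r != Check::Ok) return r;
    }
    active.erase(s);
    done.insert(s);
    return Check::Ok;
}

// Reports the first specification error found, or Check::Ok.
Check check_flowgraph(const Deps& deps, const std::set<std::string>& known)
{
    std::set<std::string> active, done;
    for (const auto& entry : deps) {
        const Check r = visit(entry.first, deps, known, active, done);
        if (r != Check::Ok) return r;
    }
    return Check::Ok;
}
```

In this model, an accumulator "a = a + b" passes the check as long as "a" is listed among the known (registered) signals and "b" among the inputs.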

Execution of a signal flowgraph

A signal flowgraph defines one clock cycle of behavior. The
semantics of a signal flowgraph execution are well defined.

• At the start of an execution, all input signals are
defined with data fetched from input queues.

• The signal flowgraph output signals are evaluated in a
demand driven way. That is, if they are defined by an
expression that has signal operands with known values,
then the output signal is evaluated. Otherwise, the
unknown values of the operands are determined first. It
is easily seen that this is a recursive process. Signals
with known values are: registered signals, constant
signals, and signals that have already been calculated
in the current execution.

• The execution ends by writing the calculated output
values to the output queues.
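The demand driven evaluation described above can be sketched as a small recursive evaluator. The code below is a stand-alone illustration, not OCAPI code; "Node", "Evaluator" and their members are names invented for this sketch.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// An expression node: operand signal names plus the operator to apply.
struct Node {
    std::vector<std::string> operands;
    std::function<double(const std::vector<double>&)> op;
};

class Evaluator {
public:
    std::map<std::string, Node> nodes;      // signals defined by expressions
    std::map<std::string, double> known;    // inputs, registers, constants, results so far
    int evaluations = 0;                    // number of operators actually evaluated

    // Returns the value of a signal, first determining unknown operands.
    // Throws std::out_of_range for a signal that is neither known nor defined.
    double value(const std::string& name)
    {
        const auto hit = known.find(name);
        if (hit != known.end()) return hit->second;   // already known: no work
        const Node& n = nodes.at(name);
        std::vector<double> args;
        for (const auto& operand : n.operands)
            args.push_back(value(operand));           // recursive demand
        ++evaluations;
        const double result = n.op(args);
        known[name] = result;                         // calculated in this execution
        return result;
    }
};
```

A signal that is used twice in the same execution is evaluated only once, because its result is stored among the known values after the first request.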



Signal flowgraph semantics are somewhat related to untimed
blocks with firing rules. A signal flowgraph needs one
token to be present on each input queue. However, the firing
rule on a signal flowgraph is not implemented. If the token
is missing, then the simulation crashes. This is a crude
way of warning you that you are about to let your hardware
evaluate a nonsense result.

The relation with untimed block firing rules makes it possible to
do a timed simulation which consists partly of signal
flowgraph descriptions and partly of untimed basic blocks.
The section "Timed simulations" treats this in more
detail.



Running a signal flowgraph by hand 



A signal flowgraph is only part of a timed description. The
control component (an FSM) still needs to be introduced.
There can however be situations in which you would like to
run a signal flowgraph directly, for instance in case you
have no control component, or if you have not yet developed
a control description for it.

The "sfg" member function "run()" performs the execution of
the signal flowgraph as described above. An example is used
to demonstrate this.



#include "qlib.h"

void main()
{
    _sig a("a");
    _sig b("b");
    _sig c(dfix(2));

    dfbfix A("A");
    dfbfix B("B");

    sfg add_two;
    add_two.starts();
    a = b + c;
    add_two << "add_two";
    add_two << ip(b, B);
    add_two << op(a, A);

    add_two.check();

    B << dfix(1) << dfix(2);

    // running silently
    add_two.eval();
    cout << A.get() << "\n";

    // running with debug information
    add_two.eval(cout);
    cout << A.get() << "\n";

    add_two.eval(cout);
}



When running this simulation, the following appears on the
screen.

3.000000e+00
add_two( b 2)
a 4
=> a 4
4.000000e+00
add_two( Queue Underflow @ get in queue B

The first line shows the result of the first "eval()" call.
When this call is given an output stream as argument, some
additional information is printed during evaluation. For
each signal flowgraph, a list of input values is printed.
Intermediate signal values are printed after the marker at the
beginning of the line. The output values as they are
entered in the output queues are printed after the "=>".
Finally, the last line shows what happens when "eval()" is
called when no inputs are available on the input queue "B".



For signal flowgraphs with registered signals, you must
also control the clock of these signals. An example of an
accumulator is given next.

#include "qlib.h"

void main()
{
    clk ck;
    _sig a("a", ck, dfix(0));
    _sig b("b");

    dfbfix A("A");
    dfbfix B("B");

    sfg accu;
    accu.starts();
    a = a + b;
    accu << "accu";
    accu << ip(b, B);
    accu << op(a, A);

    accu.check();

    B << dfix(1) << dfix(2) << dfix(3);
    while (B.getSize())
    {
        accu.eval(cout);
        accu.tick(ck);
    }
}

The simulation is controlled in a while loop that will
consume all input values in queue "B". After each run, the
clock attached to registered signal "a" is triggered. This
is done indirectly through the "sfg" member call "tick()",
that updates all registered signals that have been assigned
within the scope of this "sfg". Running this simulation
results in the following screen output

The registered signal "a" has two values: a present value
(shown left of "/") and a next value (shown right of "/").
When the clock ticks, the next value is copied to the
present value. At the end of the simulation, registered
signal "a" will contain 6 as its present value. The last value
in the output queue "A", however, will be 3, the "present value"
of "a" during the last iteration.
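The present/next behavior of the registered signal can be traced with a small stand-alone model. The sketch below does not use OCAPI; the struct "AccumulatorModel" and its members are assumptions made for this illustration, with "eval()" and "tick()" merely mirroring the roles of the calls in the example.

```cpp
#include <deque>

// Plain C++ model of the accumulator: one registered value with a present
// and a next value, an input queue (B) and an output queue (A).
struct AccumulatorModel {
    double present = 0.0;        // initial value of the register
    double next = 0.0;
    std::deque<double> in;       // models queue B
    std::deque<double> out;      // models queue A

    // One evaluation: the caller must make sure a token is available.
    void eval()
    {
        const double b = in.front();
        in.pop_front();
        out.push_back(present);  // the output sees the present value
        next = present + b;      // a = a + b assigns the next value
    }

    // Clock tick: the next value becomes the present value.
    void tick() { present = next; }
};
```

Feeding the values 1, 2 and 3 and calling eval() and tick() for each token leaves 6 as the present value, while the output queue holds 0, 1 and 3.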

Finally, if you want to include a signal flowgraph in an
untimed simulation, you must make sure that you implement
a firing rule that guards the sfg evaluation.

An example that incorporates the accumulator into an
untimed basic block is the following.

#include "qlib.h"

class accu : public base
{
public:
    accu(char *name, dfbfix &i, dfbfix &o);
    int run();
private:
    dfbfix *ipq;
    dfbfix *opq;
    sfg _accu;
    clk ck;
};

accu::accu(char *name, dfbfix &i, dfbfix &o) : base(name)
{
    ipq = i.asSource(this);
    opq = o.asSink(this);

    _sig a("a", ck, dfix(0));
    _sig b("b");
    _accu.starts();
    a = a + b;
    _accu << "accu";
    _accu << ip(b, *ipq);
    _accu << op(a, *opq);
    _accu.check();
}

int accu::run()
{
    // firing rule: one token must be present on the input queue
    if (ipq->getSize() < 1)
        return 0;
    _accu.eval();
    _accu.tick(ck);
    return 1;
}

In this example, the signal flowgraph _accu is included
in the private members of class accu.

Globals and utility functions for signal flowgraphs 

The global variable "glbNumberOfSfg" contains the number of
"sfg" objects that you have constructed in your present
OCAPI program. Given an "sfg" object, you also have a
number of utility member function calls.

• "getname()" returns the "char *" name of the signal
flowgraph.

• "merge()" joins two signal flowgraphs.

• "getisig(int n)" returns a "sig *" that indicates which
signal corresponds to input number "n" of the signal
flowgraph. If 0 is returned, this input does not exist.

• "getiqueue(int n)" returns the queue ("dfbfix *")
assigned to input number "n" of the signal flowgraph.
If 0 is returned, then this input does not exist.

• "getosig(int n)" returns a "sig *" that indicates which
signal corresponds to output number "n" of the signal
flowgraph. If 0 is returned, this output does not
exist.

• "getoqueue(int n)" returns the queue ("dfbfix *")
assigned to output number "n" of the signal flowgraph.
If 0 is returned, then this output does not exist.

You should keep in mind that a signal flowgraph is a data
structure. The source code that you have written helps to
build this data structure. However, a signal flowgraph is
not executed by running your source code. Rather, it is
interpreted by OCAPI. You can print this data structure by
means of the "cg(ostream)" member call.

For example, if you appended

accu.cg(cout);

to the "running-an-sfg-by-hand" example, then the following
output would be produced:

sfg accu
inputs { b_2 }
outputs { a_1 }
code {
a_1 = a_1_at1 + b_2;
};



Finite state machines 

With the aid of signals and signal flowgraphs, you are able
to construct clock-cycle true data processing behavior. On
top of this data processing, a control sequencing component
can be added. Such a controller allows signal
flowgraphs to be executed conditionally. The controller is also the
anchoring point for true timed system simulation, and for
hardware code generation. A signal flowgraph embedded in an
untimed block cannot be translated to a hardware processor:
you have to describe the control component explicitly.

The ctlfsm and state classes

The controller model currently embedded in OCAPI is a
Mealy-type finite state machine. This type of FSM selects
the transition to the next state based on the internal
state and the previous output value.

In an OCAPI description, you use a "ctlfsm" object to
create such a controller. In addition, you make use of
"state" objects to model controller states. The following
example shows the use of these objects.



#include "qlib.h"

void main()
{
    sfg dummy;
    dummy << "dummy";

    // create a finite state machine
    ctlfsm f;
    // give it a name
    f << "theFSM";

    // create 2 states for it
    state rst;
    state active;
    // give them a name
    rst << "rst";
    active << "active";

    // identify rst as the initial state of
    // ctlfsm f
    f << deflt(rst);

    // identify active as a plain state of ctlfsm
    // f
    f << active;

    // create an unconditional transition from
    // rst to active
    rst << allways << active;
    // 'allways' is a historical typo and will be
    // replaced by "always" in the future

    // create an unconditional transition from
    // active to active, executing the dummy sfg.
    active << allways << dummy << active;

    // show what's inside f
    cout << f;
}

There are two states in this fsm, "rst" and "active". Both
are inserted in the fsm by means of the "<<" operator. In
addition, the "rst" state is identified as the default
state of the fsm, by embedding it into the "deflt" object.
An fsm is allowed to have one default state. When the fsm
is simulated, the state at the start of the first
clock cycle will be "rst". In the hardware implementation,
a "reset" pin will be added to the processor that is used
to initialize the fsm's state register with this state.

Two transitions are defined. A transition is written
according to the template: starting state, conditions,
actions, target state, all of this separated by the "<<"
operator. The condition "allways" is a default condition
that evaluates to true. It is used to model unconditional
transitions.

The last line of the example shows a simple operation you
can do with an fsm. By relating it to the output stream,
the following will appear on the screen when you compile
and execute the example.



digraph g
{
rst [shape=box];
rst->active;
active->active;
}



This output represents a textual format of the state
transition diagram. The format is that of the "dotty" tool,
which produces a graphical layout of your state transition
diagram.
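Producing such a textual graph description from a list of transitions takes only a few lines. The sketch below is a stand-alone illustration and does not show how OCAPI generates its output; "to_dot" is a hypothetical function, and the layout simply follows the listing shown for the example fsm.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Writes a state transition diagram in the textual graph format shown
// above. The initial state is drawn as a box; every (from, to) pair
// becomes one edge.
std::string to_dot(const std::string& initial,
                   const std::vector<std::pair<std::string, std::string>>& transitions)
{
    std::ostringstream os;
    os << "digraph g\n{\n";
    os << initial << " [shape=box];\n";
    for (const auto& t : transitions)
        os << t.first << "->" << t.second << ";\n";
    os << "}\n";
    return os.str();
}
```

Calling to_dot("rst", ...) with the transitions rst to active and active to active reproduces the listing of the example.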

"dotty" is commercial software available from AT&T. 

You cannot simulate a "ctlfsm" object on its own. You must
do this indirectly through the "sysgen" object, which is
introduced in the section "Timed simulations".

The cnd class

Besides the default condition "allways", you can also use
boolean expressions of registered signals. The signals need
to be registered because we are describing a Mealy-type
fsm. You construct conditions through the "cnd" object, as
shown in the next example.

#include "qlib.h"

void main()
{
    clk ck;
    _sig a("a", ck, dfix(0));
    _sig b("b", ck, dfix(0));
    _sig a_input("a_input");
    _sig b_input("b_input");

    dfbfix A("A");
    dfbfix B("B");

    sfg some_operation;
    // some operations go here ...

    sfg readcond;
    readcond.starts();
    a = a_input;
    b = b_input;
    readcond << "readcond";
    readcond << ip(a_input, A);
    readcond << ip(b_input, B);
    readcond.check();

    // create a finite state machine
    ctlfsm f;
    f << "theFSM";

    state rst;
    state active;
    state wait;

    rst << "rst";
    active << "active";
    wait << "wait";

    f << deflt(rst);
    f << active;
    f << wait;

    rst << allways << readcond << active;
    active << _cnd(a) << readcond << some_operation << wait;
    wait << (_cnd(a) && _cnd(b)) << readcond << wait;
    wait << (!_cnd(a) || !_cnd(b)) << readcond << active;
}



A FAQ is why condition signals must be registers, and
whether they can be plain signals also. The answer is
simple: no, they can't. The fsm control object is a stand-alone
machine that must be able to 'boot' every clock
cycle. During one execution cycle, it will first select the
transition to take (based on conditions), and then execute
the signal flowgraphs that are attached to this transition.
If "immediate" transition conditions had to be expressed,
then the signals would have to be read in before the fsm
transition is made, which is not possible: the execution of
an sfg can only be done when a transition is selected, in
other words, when the condition signals are known. Besides
this semantical consideration, the registered-condition
requirement will also prevent you from writing
combinatorial control loops at the system level.

The first signal flowgraph "readcond" takes care of reading
in two values "a" and "b" that are used in transition
conditions. The sfg reads the signals "a" and "b" in
through the intermediate signals "a_input" and "b_input".
This way, "a" and "b" are explicitly assigned in the signal
flowgraph, and the semantical check "readcond.check()" will
not complain about unassigned signals.

The fsm below it defines three states. Besides an initial
state "rst" and an operative state "active", a wait state
"wait" is defined, that is entered when the input signal
"a" is high. This is expressed by the "_cnd(a)" transition
condition in the second fsm transition. You must use
"_cnd()" instead of "cnd()" for the same reason that
you must use "_sig()" instead of "sig()": the underscore-type
classes are empty boxes that allocate the objects that
do the real work for you. This allocation is dynamic and
independent of the C++ scope.

Once the wait state is entered, the fsm can leave it only when
the signal "a" or "b" goes low. As long as the
signals "a" and "b" both remain high, the wait state is not
left. This is expressed by the transition condition of the
third fsm transition, which uses the "&&" operator for the
and condition. The transition condition of the last transition
expresses the exit from the wait state. It uses the logical
not "!" and logical or "||" operators to express this.

The "readcond" signal flowgraph is executed at all
transitions. This ensures that the signals "a" and "b" are
updated every cycle. If you fail to do this, then the values
of "a" and "b" will not change, potentially creating a
deadlock.

To summarize, you can use either "allways" or a logical
expression of "_cnd()" objects to express a transition
condition. The signals used in the condition must be
registers. This results in a Mealy-type fsm description.
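The order of events in one cycle (first select the transition from the registered values, then execute the attached flowgraphs) can be traced with a stand-alone model of the three-state example. The sketch below does not use OCAPI. The names "State", "Controller" and "step" are hypothetical, and the choice to hold state and registers when no transition matches is an assumption of this sketch, since the text does not specify that case.

```cpp
enum class State { Rst, Active, Wait };

// Plain C++ model of the rst/active/wait controller.
struct Controller {
    State state = State::Rst;
    bool a = false;        // registered condition signals (present values)
    bool b = false;
    int operations = 0;    // how often some_operation was executed

    // One clock cycle. Returns false when no transition matched
    // (assumption: state and registers are then held).
    bool step(bool a_input, bool b_input)
    {
        State target = state;
        bool run_operation = false;
        switch (state) {
        case State::Rst:                       // rst << allways
            target = State::Active;
            break;
        case State::Active:                    // active << _cnd(a)
            if (!a) return false;
            run_operation = true;
            target = State::Wait;
            break;
        case State::Wait:                      // a && b stays, otherwise leave
            target = (a && b) ? State::Wait : State::Active;
            break;
        }
        if (run_operation) ++operations;
        a = a_input;                           // readcond: registers take the inputs
        b = b_input;
        state = target;
        return true;
    }
};
```

Because the conditions are taken from the registers, an input applied in one cycle influences the transition selected in the following cycle, not the current one.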

Utility functions for fsm objects 

A number of utility functions on the "ctlfsm" and "state"
classes are available for query purposes. This set is only
minimal: the objects are intended to be manipulated by the
cycle scheduler and code generators.

sfg action;
ctlfsm f;
state s1;
state s2;

f << deflt(s1);
f << s2;

s1 << allways << s2;
s2 << allways << action << s1;

// run through all the states in f
statelist *r;
for (r = f.first; r; r = r->next)
{
}

// print the number of states in f,
// print the number of transitions in f,
// print the name of f,
// print the number of sfg's in f
cout << f.numstates() << "\n";
cout << f.numtransitions() << "\n";
cout << f.getname() << "\n";
cout << f.numactions() << "\n";

// print the name of a state
cout << s1.getname() << "\n";

The basic block for timed simulations

Using signals, signal flowgraphs, finite state machines and
states, you can construct a timed description of a block.
Having obtained such a description, it is convenient to
merge it with the untimed description. This way, you will
have one class that allows both timed and untimed
simulation. Of course, this merging is a matter of writing
style, and nothing forces you to actually have both a timed
and untimed description for a block.

The basic block example, that was introduced in the section
"The basic block", will now be extended with a timed
version. As before, both an include file and a code file
will be defined. The include file, "add.h", looks like the
following code.

#ifndef ADD_H
#define ADD_H

#include "qlib.h"

class add : public base
{
public:
    add(char *name, FB & _in1, FB & _in2, FB & _o1);

    // untimed
    int run();

    // timed
    void define();
    ctlfsm &fsm() {return _fsm;}
private:
    FB *in1;
    FB *in2;
    FB *o1;
    ctlfsm _fsm;
    sfg _add;
    state _go;
};

#endif

The private members now also contain a control fsm object,
in addition to signal flowgraph objects and states. If you
feel this is becoming too verbose, you will find help in
the section "Faster description using macros", which defines
a macro set that significantly accelerates description
entry.

In the public members, two additional member functions are
declared: the "define()" function, which sets up the
timed description data structure, and the "fsm()" function, which
returns a pointer to the fsm controller. Through this
pointer, OCAPI accesses everything it needs to do
simulations and code generation.

The contents of the adder block will be described in
"add.cxx".

#include "add.h"

add::add(char *name, FB & _in1, FB & _in2, FB & _o1) : base(name)
{
    in1 = _in1.asSource(this);
    in2 = _in2.asSource(this);
    o1 = _o1.asSink(this);
    define();
}

int add::run()
{

}

void add::define()
{
    _sig i1("i1");
    _sig i2("i2");
    _sig ot("ot");

    _add << "add";
    _add.starts();
    ot = i1 + i2;
    _add << ip(i1, *in1);
    _add << ip(i2, *in2);
    _add << op(ot, *o1);

    _fsm << "fsm";
    _go << "go";
    _fsm << deflt(_go);
    _go << allways << _add << _go;
}

If the timed description also uses registers, then a
pointer to the global clock must also be provided (OCAPI
generates single-clock, synchronous hardware). The easiest
way is to extend the constructor of "add" with an
additional parameter "clk &ck", that will also be passed to
the "define" function.

Timed simulations 

By obtaining timed descriptions for your untimed basic
blocks, you are now ready to proceed to a timed simulation.
A timed simulation differs from an untimed one in that it
proceeds clock cycle by clock cycle. Concurrent behavior
between different basic blocks is simulated on a cycle-by-cycle
basis. In contrast, in an untimed simulation, this
concurrency is present on an iteration-by-iteration basis.

The sysgen class 

The "sysgen" object is for timed simulations the equivalent
of a "scheduler" object for untimed simulations. In
addition, it also takes care of code and testbench
generation, which explains the name.

The sysgen class is used at the system level. The timed
"add" class, defined in the previous section, is used as an
example to construct a system which uses untimed file
sources and sinks, and a timed "add" class.

#include "qlib.h"
#include "add.h"

void main()
{
    dfbfix i1("i1");
    dfbfix i2("i2");
    dfbfix o1("o1");

    src SRC1("SRC1", i1, "SRC1");
    src SRC2("SRC2", i2, "SRC2");
    add ADD("ADD", i1, i2, o1);
    snk SNK1("SNK1", o1, "SNK1");

    sysgen S1("S1");
    S1 << SRC1;
    S1 << SRC2;
    S1 << ADD.fsm();
    S1 << SNK1;

    S1.setinfo(verbose);

    clk ck;
    int i;
    for (i = 0; i < 3; i++)
    {
        S1.run(ck);
    }
}



The simulation is set up as before with queue objects and
basic blocks. Next, a "sysgen" object is created, with name
"S1". All basic blocks in the simulation are appended to
the "sysgen" object by means of the "<<" operator. If a
timed basic block is to be used, as for instance in case of
the "add" object, then the "fsm()" pointer must be
presented to "sysgen" rather than the basic block itself. A
"sysgen" object knows how to run and combine both timed and
untimed objects. For the description shown above, untimed
versions of the file sources and sink "src" and "snk" will
be used, while the timed version of the "add" object will
be used.

Next, three clock cycles of the system are run. This is
done by means of the "run(ck)" member function call of
"sysgen". The clock object "ck" is, because this simulation
contains no registered signals, a dummy object. When
running the simulator executable with stimuli file contents

SRC1   SRC2   -- not present in the file
1      4
2      5
3      6

you see the following appearing on the screen.

*** INFO: Defining block SRC1
*** INFO: Defining block SRC2
*** INFO: Defining block ADD
*** INFO: Defining block SNK1
fsm fsm: transition from go to go
add#0
add#1
in i1 1
in i2 4
sig ot 5
out ot 5
fsm fsm: transition from go to go
add#0
add#1
in i1 2
in i2 5
sig ot 7
out ot 7
fsm fsm: transition from go to go
add#0
add#1
in i1 3
in i2 6
sig ot 9
out ot 9

The debugging output produced is enabled by the "setinfo()"
call on the "sysgen" object. The parameter "verbose"
enables full debugging information. For each clock cycle,
each fsm reports which transition it takes. The fsm of the
"add" block is called "fsm", and as is seen it makes
transitions from the single state "go" to the obvious
destination. Each signal flowgraph during this simulation
is executed in two phases (below it is indicated why).
During simulation, the value of each signal is printed.

Selecting the simulation verbosity 



The "setinfo" member function call of "sysgen" selects the
amount of debugging information that is produced during
simulation. Four values are available:

• "silent" will cause no output at all. This can
significantly speed up your simulation, especially for
large systems containing several hundred signal
flowgraphs.

• "terse" will only print the transitions that fsm's make.

• "verbose" will print detailed information on all signal
updates.

• "regcontents" will print a list of the values of registered
signals that change during the current simulation. This
is by far the most interesting option if you are
debugging at the system level: when nothing happens, for
instance when all your timed descriptions are in some
"hold" mode, then no output is produced. When there is a
lot of activity, then you will be able to track all
registered signals that change.



The example below is taken from a simulation containing 484
registered signals and 483 signal flowgraphs. Using
"setinfo(verbose)" here might require a good text editor to
see what is happening, if anything happens at all before
your disk quota is exceeded.

For instance, the code fragment

  sysgen S("S");
  S.setinfo(regcontents);

  int cycle;
  for (cycle=0; cycle < 100; cycle++)
  {
    cout << "> Cycle " << cycle << "\n";
    S.run(ck);
  }

can produce an output as shown below.




> Cycle 18
  coef_ram_ir_2       0    1
  copy_step_flag      1    0
  ext_ready_out       1    0
  pc                 15   16
  step_flag           1    0
> Cycle 19
  coef_ram_ir_2       1    0
  coef_wr_adr        12   13
  hold_pc             0   16
  pc                 16   17
  pc_ctl_ir_1         1    0
> Cycle 20
  step_clock          0    1
> Cycle 21
  copy_step_flag      0    1
  prev_step_clock     0    1
  step_flag           0    1
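
The listing above reports, per cycle, only the registered signals whose value changed, together with the old and the new value. The filtering behind such a report can be sketched in plain C++ as follows. This is not OCAPI code: the names "RegFile", "RegChange" and "changed_registers" are hypothetical and only illustrate the idea of comparing two register snapshots.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical snapshot of all registered signals: name -> value.
typedef std::map<std::string, long> RegFile;

// One line of a "regcontents"-style report.
struct RegChange {
    std::string name;
    long before;
    long after;
};

// Compare the register contents before and after a clock tick and
// return only the registers whose value changed. The result is sorted
// by name because std::map iterates in key order.
std::vector<RegChange> changed_registers(const RegFile &before,
                                         const RegFile &after)
{
    std::vector<RegChange> out;
    for (RegFile::const_iterator it = after.begin();
         it != after.end(); ++it) {
        RegFile::const_iterator old = before.find(it->first);
        if (old != before.end() && old->second != it->second) {
            RegChange c = { it->first, old->second, it->second };
            out.push_back(c);
        }
    }
    return out;
}
```

When no register changes, the returned list is empty, which corresponds to the "no output is produced" behavior described for the "hold" case.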



Three phases are better

Although you will be saved from the details behind two-phase
simulation, it is worthwhile to see the motivation behind
it.

When you run an "sfg" "by hand" using the "run()" method of
an "sfg", the simulation proceeds in one phase: read
inputs, calculate, produce output. The "sysgen" object, on
the other hand, uses a two-phase simulation mechanism.

The reason is the following. In the presence of feedback
loops, your system data flow simulation will need initial
values on the communication queues in order to start the
simulation. However, the code generator assumes the
communication queues will translate to wiring. Therefore,
there will never be storage in the implementation of a
communication queue to hold these initial values. OCAPI
works around this by producing these initial values at
runtime. This gives rise to a three-phase simulation: in
the first phase, initial values are produced, while in the
second phase, they are consumed again. This process repeats
every clock cycle.

The three-phase simulation mechanism is also able to detect
combinatorial loops at the system level. If such a loop
exists, then the first phase of the simulation will not
produce any initial value on the system interconnect.
Consequently, in the last phase there will be at least one
signal flowgraph that is not able to complete execution in
the current clock cycle. In that case, OCAPI will stop the
simulation. You also get a list of all signal flowgraphs
that have not completed the current clock cycle, in
addition to the queue statistics that are attached to these
signal flowgraphs.
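
The loop detection described above can be illustrated with a small token-flow model in standard C++. It is only a sketch under simplifying assumptions, not the OCAPI scheduler: the "Block" structure and the "stuck_blocks" function are hypothetical, and each wire is reduced to a name that either carries a token in the current cycle or does not.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one signal flowgraph in the system netlist.
struct Block {
    std::string name;
    std::vector<std::string> inputs;        // wires read this cycle
    std::vector<std::string> reg_outputs;   // wires driven by registers
    std::vector<std::string> comb_outputs;  // wires driven combinatorially
};

// Model the token flow of one clock cycle and return the names of the
// blocks that could not complete. An empty result means the cycle
// finished; otherwise the listed blocks wait on a combinatorial loop
// (or on a wire that nobody drives).
std::vector<std::string> stuck_blocks(const std::vector<Block> &blocks)
{
    std::set<std::string> ready;  // wires that carry a token

    // First phase: register-driven outputs need no inputs.
    for (std::size_t b = 0; b < blocks.size(); ++b)
        for (std::size_t i = 0; i < blocks[b].reg_outputs.size(); ++i)
            ready.insert(blocks[b].reg_outputs[i]);

    // Later phase: fire every block whose inputs are all available,
    // until no further block can fire.
    std::vector<bool> done(blocks.size(), false);
    bool progress = true;
    while (progress) {
        progress = false;
        for (std::size_t b = 0; b < blocks.size(); ++b) {
            if (done[b]) continue;
            bool can_fire = true;
            for (std::size_t i = 0; i < blocks[b].inputs.size(); ++i)
                if (ready.count(blocks[b].inputs[i]) == 0)
                    can_fire = false;
            if (!can_fire) continue;
            for (std::size_t i = 0; i < blocks[b].comb_outputs.size(); ++i)
                ready.insert(blocks[b].comb_outputs[i]);
            done[b] = true;
            progress = true;
        }
    }

    std::vector<std::string> stuck;
    for (std::size_t b = 0; b < blocks.size(); ++b)
        if (!done[b]) stuck.push_back(blocks[b].name);
    return stuck;
}
```

A feedback loop that passes through at least one registered output completes normally, while two blocks that feed each other combinatorially are both reported as stuck.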

Hardware code generation

OCAPI allows you to translate all timed descriptions to a
synthesizable hardware description.

• For each timed description, you get a datapath ".dsfg"
  file, which can be entered into the Cathedral-3 datapath
  synthesis environment, converted to VHDL and
  postprocessed by Synopsys-dc logic synthesis.

• For each timed description, you also get a controller
  ".dsfg" file, which is synthesized through the same
  environment.

• You also get a glue cell that interconnects the
  resulting datapath and controller VHDL files.

• You get a system interconnect file that integrates all
  glue cells in your system. For this system interconnect
  file, you can optionally specify system inputs and
  outputs, scan chain interconnects, and RAM
  interconnects. The file is VHDL.

• Finally, you also get debug information files that
  summarize the behavior of, and the ports on, each
  processor.

Untimed blocks are not translated to hardware. The use of
the actual synthesis environments is not discussed in this
section. It is assumed to be known by a person skilled in
the art.

The generate() call

The member call "generate()" performs the code generation
for you. In the adder example, you just have to add

  S1.generate();

at the end of the main function. If you compile this
description and run it, you will see that things are not
quite OK:

*** INFO: Generating Systen Link Cell
*** INFO: Component generation for S1
*** INFO: C++ currently defines 5 sig, 4 _sig, 1 sfg.
*** INFO: Generating FSMD fsm
*** INFO: FSMD fsm defines 1 instructions
DSFGgen: signal i1 has no wordlength spec.
DSFGgen: signal i2 has no wordlength spec.
DSFGgen: signal ot has no wordlength spec.
DSFGgen: not all signals were quantized. Aborting.
*** INFO: Auto-cleanup of sfg

Indeed, in the adder example up to now, nothing has been
entered regarding wordlengths. During code generation,
OCAPI does quite some consistency checking. The general
advice in case of warnings and errors is: if you see an
error or warning message, investigate it. When you
synthesize code that showed a warning or error during
generation, you will likely fail in the synthesis process
too.

The "add" description is now extended with wordlengths.
8-bit wordlengths are chosen. You modify the "add" class to
include the following changes.

  void add::define()
  {
    dfix wl(0,8,0);

    _sig i1("i1", wl);
    _sig i2("i2", wl);
    _sig ot("ot", wl);
  }

After recompiling and rerunning the OCAPI program, you now
see:

*** INFO: Generating Systen Link Cell
*** INFO: Component generation for S1
*** INFO: C++ currently defines 5 sig, 4 _sig, 1 sfg.
*** INFO: Generating FSMD fsm
*** INFO: FSMD fsm defines 1 instructions
*** INFO: C++ currently defines 31 sig, 21 _sig, 3 sfg.
*** INFO: Auto-cleanup of sfg

In the directory where you ran this, you will find the
following files:

• "fsm_dp.dsfg", the datapath description of "add"

• "fsm_fsm.dsfg", the controller description of "add"

• "fsm.vhd", the glue cell description of "add"

• "S1.vhd", the system interconnect cell

• "fsm.ports", a list of the I/O ports of "add".

The glue cell "fsm.vhd" has the following contents (only
the entity declaration part is shown).

-- Cath3 Processor for FSMD design fsm

library IEEE;
use IEEE.std_logic_1164.all;

entity fsm is
  port (
    reset: in std_logic;
    clk: in std_logic;
    i1: in std_logic_vector ( 7 downto 0 );
    i2: in std_logic_vector ( 7 downto 0 );
    ot: out std_logic_vector ( 7 downto 0 )
  );
end fsm;



Each processor has a reset pin, a clock pin, and a number
of I/O ports, depending on the inputs and outputs defined
in the signal flowgraphs contained in this processor. All
signals are mapped to "std_logic" or "std_logic_vector".
The reset pin is used for synchronous reset of the embedded
finite state machine. If you need to initialize registered
signals in the datapath, then you have to describe this
explicitly in a signal flowgraph, and execute it upon the
first transition out of the initial state.

The "fsm.ports" file indicates which ports are read in each
transition. In the example of the "add" class, there is
only one transition, which results in the following
".ports" file

********** SFG fsmgogo0 **********
Port #   I/O   Port   Q
1        I     i1     i1
2        I     i2     i2
1        O     ot     o1

The name of an input or output signal is used as a port
name, while the name of the queue associated with it
relates to the system net name that will be connected to
this port.

System cell refinements

The system link cell incorporates all glue cells of your
current timed system description. These glue cells are
connected if they read from or write to the same system
queue. Some refinements are possible on the "sysgen" object
that also allow you to indicate system level inputs and
outputs, scan chains, and RAM connections.

System inputs and outputs are indicated with the "inpad()"
and "outpad()" member calls of "sysgen". In the example,
this is specified as

  sysgen S1("S1");

  dfix b8(0,8,0);

  S1.inpad(i1, b8);
  S1.inpad(i2, b8);
  S1.outpad(o1, b8);

Making these connections will make the "i1", "i2", "o1"
signals appear in the entity declaration of the system cell
"S1". The entity declaration inside the file "S1.vhd" thus
looks like

entity S1 is
  port (
    reset: in std_logic;
    clk: in std_logic;
    i1: in std_logic_vector ( 7 downto 0 );
    i2: in std_logic_vector ( 7 downto 0 );
    o1: out std_logic_vector ( 7 downto 0 )
  );
end S1;



Scan chains can be added at the system level, too. For each
scan chain you must indicate which processors it should
include. Suppose you have three basic blocks (including a
timed description and registers) with names "BLOCK1",
"BLOCK2", "BLOCK3". You attach the blocks to two scan
chains using the following code.

  scanchain SCAN1("scan1");
  scanchain SCAN2("scan2");

  SCAN1.addscan(& BLOCK1.fsm());
  SCAN1.addscan(& BLOCK2.fsm());
  SCAN2.addscan(& BLOCK3.fsm());

The "sysgen" object identifies the required scan chain
connections through the "fsm" objects that are assigned to
it. In order to have reasonable circuit test times, you
should not include more than 300 flip-flops in each scan
chain. If you have a processor that contains more than 300
flip-flops, then you should use another scan chain
connection strategy.
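
The 300 flip-flop guideline can be turned into a simple planning step before the "addscan" calls are written. The sketch below is plain C++ and not part of OCAPI: "Processor" and "pack_scan_chains" are hypothetical names, the flip-flop count of each processor is assumed to be known from synthesis reports, and a first-fit assignment is used only as one possible heuristic.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of a processor: name and flip-flop count.
typedef std::pair<std::string, int> Processor;

// Assign processors to scan chains with a first-fit rule so that no
// chain exceeds `limit` flip-flops. Processors that are larger than
// the limit on their own are reported in `oversized`, because they
// need a different connection strategy.
std::vector<std::vector<std::string> > pack_scan_chains(
    const std::vector<Processor> &procs, int limit,
    std::vector<std::string> &oversized)
{
    std::vector<std::vector<std::string> > chains;
    std::vector<int> load;  // flip-flops already placed in each chain

    for (std::size_t p = 0; p < procs.size(); ++p) {
        if (procs[p].second > limit) {
            oversized.push_back(procs[p].first);
            continue;
        }
        std::size_t c = 0;
        while (c < chains.size() && load[c] + procs[p].second > limit)
            ++c;
        if (c == chains.size()) {  // no existing chain has room
            chains.push_back(std::vector<std::string>());
            load.push_back(0);
        }
        chains[c].push_back(procs[p].first);
        load[c] += procs[p].second;
    }
    return chains;
}
```

Each resulting chain then corresponds to one "scanchain" object, and each name in it to one "addscan" call.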



Finally, you can generate code for the standard untimed
block RAM. There are two possible interconnection
mechanisms: the first includes the untimed RAM blocks in
"sysgen" as internal components of the system link cell.
The second includes the RAM blocks as external components.
This latter method requires you to construct a new
"system-system link cell" that includes the RAM entities
and the system link cell in a larger structure. However, it
might be required in case you have to remap the standard
RAM interface, or introduce additional asynchronous timing
logic.

An example of the two methods is shown next.

  ram RAM1("ram1", addr1, di1, do1, wr, rd, 128);
  ram RAM2("ram2", addr2, di2, do2, wr, rd, 128);

  // types of address and data bus
  dfix addrtype(0, 7, 0);
  dfix dattype(0, 4, 0);

  sysgen S1("S1");

  // define an external ram
  S1.extern_ram(RAM1, addrtype, dattype);

  // define an internal ram
  S1.intern_ram(RAM2, addrtype, dattype);

Pitfalls for code generation

As always, there are a number of pitfalls when things get
complex. You should watch for the following when diving
into code generation.

OCAPI generates nicely formatted code that you can
investigate. To help you in this process, the actual signal
names that you have specified are regenerated in the VHDL
and DSFG code. This implies that you have to stay away from
VHDL and DSFG keywords, or else you will get an error from
either Cathedral-3 or Synopsys.

The mapping of the fixed point library to hardware is, in
the present release, minimal. First of all, although
registered signals allow you to specify an initial value,
you cannot rely on this for the hardware circuit.
Registers, when powered on, take on a random state.
Therefore, make sure that you specify the initialization
sequence of your datapath. A second fixed point pitfall is
that hardware support for the different quantization
schemes is lacking. It is assumed that you will finally use
truncated quantization on the lsb-side and wrap-around
quantization on the msb-side of all signals. The other
quantization schemes require additional hardware to be
included. If you really need, for instance, saturated msb
quantization, then you will have to describe it in terms of
the default quantization.
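
To make the default quantization concrete, the following sketch models it on plain integers in standard C++. It is an illustration only and does not use the OCAPI fixed point library: the function names "wrap", "truncate_lsb" and "saturate" are hypothetical, values are treated as two's-complement integers, and word widths are limited to 30 bits to stay within the range of "long" on all common platforms.

```cpp
#include <cassert>

// msb-side wrap-around: fold v into a signed two's-complement word
// of `bits` bits.
long wrap(long v, int bits)
{
    assert(bits >= 1 && bits <= 30);
    const long span = 1L << bits;   // number of representable codes
    long m = v % span;              // in (-span, span)
    if (m < 0) m += span;           // in [0, span)
    if (m >= span / 2) m -= span;   // in [-span/2, span/2)
    return m;
}

// lsb-side truncation: drop `drop` lsb's. In two's complement this
// rounds toward minus infinity.
long truncate_lsb(long v, int drop)
{
    assert(drop >= 0 && drop <= 30);
    const long step = 1L << drop;
    long q = v / step;
    if (v % step < 0) --q;          // turn truncating division into floor
    return q;
}

// msb-side saturation, written out with explicit comparisons and
// selections. This is the extra hardware that wrap-around avoids.
long saturate(long v, int bits)
{
    assert(bits >= 1 && bits <= 30);
    const long hi = (1L << (bits - 1)) - 1;
    const long lo = -(1L << (bits - 1));
    if (v > hi) return hi;
    if (v < lo) return lo;
    return v;
}
```

For an 8-bit word, the value 128 wraps to -128 but saturates to 127, which shows why a design that relies on saturation has to describe the comparison and selection explicitly in its signal flowgraph.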


Finally, the current set of hardware operators in
Cathedral-3 is designed for signed representations. They
also work with unsigned representations, as long as you do
not use relational operations (<, > and the like). In that
case, you should implement the unsigned operation as a
signed one with one extra bit.
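
The effect of the extra bit can be checked with a small model in standard C++. The sketch is not OCAPI or Cathedral-3 code: "as_signed", "signed_less" and "unsigned_less" are hypothetical helper names, and the signed comparator is modeled by interpreting a bit pattern as a two's-complement number of a given width.

```cpp
#include <cassert>

// Interpret the low `bits` bits of `code` as a two's-complement number.
long as_signed(unsigned long code, int bits)
{
    assert(bits >= 1 && bits <= 30);
    const unsigned long span = 1UL << bits;
    code &= span - 1;
    long v = static_cast<long>(code);
    if (code >= span / 2) v -= static_cast<long>(span);
    return v;
}

// "Less than" as a signed comparator of the given width sees it.
bool signed_less(unsigned long a, unsigned long b, int bits)
{
    return as_signed(a, bits) < as_signed(b, bits);
}

// Unsigned "less than" on `bits`-bit operands, built from a signed
// comparator that is one bit wider. The operands are zero-extended,
// so the extra msb is 0 and both values are seen as non-negative.
bool unsigned_less(unsigned long a, unsigned long b, int bits)
{
    const unsigned long mask = (1UL << bits) - 1;
    return signed_less(a & mask, b & mask, bits + 1);
}
```

With 8-bit operands 200 and 100, the 8-bit signed comparator sees -56 and 100 and wrongly reports 200 < 100, while the 9-bit version gives the expected unsigned result.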






Verification and testbenches

Once you have obtained a gate level implementation of your
circuit, it is necessary to verify the synthesis result.
OCAPI helps you with this by generating testbenches and
testbench stimuli for you while you run timed simulations
and do code generation.

The example of the "add" class introduced previously is
picked up again, and testbench generation capability is
added to the OCAPI description.

Generation of testbench vectors

The next example performs a three-cycle simulation of the
"add" class and generates testbench vectors for it.

  #include "qlib.h"

  void main()
  {
    dfbfix i1("i1");
    dfbfix i2("i2");
    dfbfix o1("o1");

    src SRC1("SRC1", i1, "SRC1");
    src SRC2("SRC2", i2, "SRC2");
    add ADD("ADD", i1, i2, o1);
    snk SNK1("SNK1", o1, "SNK1");

    sysgen S1("S1");
    S1 << SRC1;
    S1 << SRC2;
    S1 << ADD.fsm();
    S1 << SNK1;

    ADD.fsm().tb_enable();

    clk ck;
    int i;

    for (i=0; i<3; i++)
      S1.run(ck);

    ADD.fsm().tb_data();
  }


Just before the timed simulation starts, you enable the
generation of testbench vectors by means of a "tb_enable()"
member call for each fsm that requires testbench vectors.

During simulation, the values on the input and output ports
of the "add" processor are recorded. After the simulation
is done, the testbenches are generated using a "tb_data()"
member function call.

Testbench generation leaves three data files behind:

• "fsm_tb.dat" contains binary vectors of all inputs of
  the "add" processor. It is intended to be read in by the
  VHDL simulator as stimuli.
• "fsm_tb.dat_hex" contains hexadecimal vectors of all
  inputs and outputs of the "add" processor. It contains
  the output that should be produced by the VHDL simulator
  when the synthesis was successful.
• "fsm_tb.dat_info" documents the contents of the stimuli
  files by saying which stimuli vector corresponds to
  which signal.

When compiling and running this OCAPI program, the
following appears on screen.

*** INFO: Defining block SRC1
*** INFO: Defining block SRC2
*** INFO: Defining block ADD
*** INFO: Defining block SNK1
*** INFO: Creating stimuli monitor for testbench of FSMD fsm
*** INFO: Generating stimuli data file for testbench fsm_tb.
*** INFO: Testbench fsm_tb has 3 vectors.

Afterwards, you can take a look at each of the three
generated testbench files.

file: fsm_tb.dat
00000001 00000100
00000010 00000101
00000011 00000110

file: fsm_tb.dat_hex
01 04 05
02 05 07
03 06 09

file: fsm_tb.dat_info
Stimuli for fsm_tb contains 3 vectors for
i1_stim read
i2_stim read
Next columns occur only in _hex.dat file and are outputs
o1_stim write
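
The relation between the recorded port values and the two vector files can be reproduced with a few lines of standard C++. This is a sketch for illustration, not the OCAPI testbench generator: "to_binary", "to_hex" and "vector_line" are hypothetical names, and the exact whitespace of the real files (for instance a trailing blank) is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Render the low `bits` bits of v as a binary string, msb first.
std::string to_binary(unsigned long v, int bits)
{
    std::string s;
    for (int i = bits - 1; i >= 0; --i)
        s += ((v >> i) & 1UL) ? '1' : '0';
    return s;
}

// Render the low `bits` bits of v as hexadecimal digits, msb first.
// `bits` is assumed to be a multiple of 4.
std::string to_hex(unsigned long v, int bits)
{
    assert(bits % 4 == 0);
    static const char digits[] = "0123456789ABCDEF";
    std::string s;
    for (int i = bits - 4; i >= 0; i -= 4)
        s += digits[(v >> i) & 0xFUL];
    return s;
}

// Join the port values of one clock cycle into one line of a vector
// file, separated by single blanks.
std::string vector_line(const std::vector<unsigned long> &ports,
                        int bits, bool hex)
{
    std::string line;
    for (std::size_t i = 0; i < ports.size(); ++i) {
        if (i) line += ' ';
        line += hex ? to_hex(ports[i], bits) : to_binary(ports[i], bits);
    }
    return line;
}
```

For the first simulated cycle, the inputs 1 and 4 give the binary stimuli line, and the inputs together with the output 5 give the hexadecimal reference line shown in the listing above.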

You can now use the vectors in the simulator. But first,
you must also generate a testbench driver in VHDL.

Generation of testbench drivers

To generate a testbench driver, simply call the
"tb_enable()" member function of the "add" fsm before you
initiate code generation. You will end up with a VHDL file
"fsm_tb.vhd" that contains the following driver.

-- Test Bench for FSMD design fsm

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_textio.all;
use std.textio.all;

library clock;
use clock.clock.all;

entity fsm_tb is
end fsm_tb;

architecture rtl of fsm_tb is
  signal reset: std_logic;
  signal clk: std_logic;
  signal i1: std_logic_vector ( 7 downto 0 );
  signal i2: std_logic_vector ( 7 downto 0 );
  signal ot: std_logic_vector ( 7 downto 0 );

  component fsm
    port (
      reset: in std_logic;
      clk: in std_logic;
      i1: in std_logic_vector ( 7 downto 0 );
      i2: in std_logic_vector ( 7 downto 0 );
      ot: out std_logic_vector ( 7 downto 0 )
    );
  end component;

begin
  crystal(clk, 50 ns);
  fsm_dut: fsm
    port map (
      reset => reset,
      clk => clk,
      i1 => i1,
      i2 => i2,
      ot => ot
    );

  ini: process
  begin
    reset <= '1';
    wait until clk'event and clk = '1';
    reset <= '0';
    wait;
  end process;

  input: process
    file stimuli : text is in "fsm_tb.dat";
    variable aline : line;
    file stimulo : text is out "fsm_tb.sim_out";
    variable oline : line;
    variable v_i1: std_logic_vector ( 7 downto 0 );
    variable v_i2: std_logic_vector ( 7 downto 0 );
    variable v_ot: std_logic_vector ( 7 downto 0 );
    variable v_i1_hx: std_logic_vector ( 7 downto 0 );
    variable v_i2_hx: std_logic_vector ( 7 downto 0 );
    variable v_ot_hx: std_logic_vector ( 7 downto 0 );
  begin
    wait until reset'event and reset = '0';
    loop
      if (not(endfile(stimuli))) then
        readline(stimuli, aline);
        read(aline, v_i1);
        read(aline, v_i2);
      else
        assert false
          report "End of input file reached"
          severity warning;
      end if;
      i1 <= v_i1;
      i2 <= v_i2;
      wait for 50 ns;
      v_ot := ot;
      v_i1_hx := v_i1;
      v_i2_hx := v_i2;
      v_ot_hx := v_ot;
      hwrite(oline, v_i1_hx);
      write(oline, ' ');
      hwrite(oline, v_i2_hx);
      write(oline, ' ');
      hwrite(oline, v_ot_hx);
      write(oline, ' ');
      writeline(stimulo, oline);
      wait until clk'event and clk = '1';
    end loop;
  end process;
end rtl;

configuration tbc_rtl of fsm_tb is
  for rtl
    for all : fsm
      use entity work.fsm(structure);
    end for;
  end for;
end tbc_rtl;

The testbench uses one additional library, "clock", which
contains the "crystal" component. This component is a
simple clock generator that drives a 50% duty cycle clk.

This testbench will generate a file "fsm_tb.sim_out". After
running the testbench in VHDL, this file should be exactly
the same as "fsm_tb.dat_hex". You can use the unix "diff"
command to check this. The only possible differences can
occur in the first few simulation cycles, if the VHDL
simulator initializes the registers to "X".

Using automatic testbench generation greatly speeds up the
verification process. You should consider using it whenever
you are doing code generation.

Compiled code simulations

For large designs, simulation speed can become prohibitive.
The restricting factor of OCAPI is that the signal
flowgraph data structures are interpreted at runtime. In
addition, runtime quantization (fixed point simulation)
takes up quite some CPU power.

OCAPI allows you to generate a dedicated C++ simulator
that runs compiled code instead of interpreted code. Also,
additional optimizations are done on the fixed point
simulation. The result is a simulator that runs one to two
orders of magnitude faster than the interpreted OCAPI
simulation. This speed increase comes on top of the order
of magnitude that interpreted OCAPI already gains over
event-driven VHDL simulation.

As an example, a 75Kgate design was found to run at 55
cycles per second (on a HP/9000). This corresponds to 4.1
million gates per second, and motivates why C++ is the way
to go for system synthesis.

Generating a compiled code simulator

The compiled code generator is integrated into the "sysgen"
object. There is one member function, "compiled()", that
will generate this simulator for you.



  #include "qlib.h"
  #include "add.h"

  void main()
  {
    dfbfix i1("i1");
    dfbfix i2("i2");
    dfbfix o1("o1");
    add ADD("ADD", i1, i2, o1);

    sysgen S1("S1");
    S1 << ADD.fsm();
    S1.compiled();
  }

In this simple example, a compiled code generator is made
for a design containing only one FSM. The generator allows
several fsm blocks to be included, in addition to untimed
blocks.

When this program is compiled and run, it leaves behind a
file "S1_ccs.cxx" that contains the dedicated simulator.
For the OCAPI user, the simulator defines one procedure,
"one_cycle()", that simulates one cycle of the system.

When called, this procedure also produces debugging output
similar to the "setinfo(regcontents)" call for "ctlfsm"
objects. This procedure must be linked to a main program
that will execute the simulation.

If an untimed block is present in the system, then it will
be included in the dedicated simulator. In order to declare
it, you must provide a member function "CCSdecl(ofstream
&)" that generates the required C++ declaration. As an
example, the basic RAM block declares itself as follows:

  -- file: ram.h

  class ram : public base
  {
    public:
      ram (char * name,
           FB& _address,
           FB& _data_in,
           FB& _data_out,
           FB& _w,
           FB& _r,
           int _size);
      void CCSdecl(ofstream &os);
    private:
  };

  -- file: ram.cxx

  void ram::CCSdecl(ofstream &os)
  {
    os << " #include \"ram.h\"\n";
    os << " ram " << typeName() << "(";
    os << "\"" << typeName() << "\", ";
    os << address.name() << ", ";
    os << data_in.name() << ", ";
    os << data_out.name() << ", ";
    os << w.name() << ", ";
    os << r.name() << ", ";
    os << size << ");\n";
  }

This code enables the ram to reproduce the declaration by
which it was originally constructed in the interpreted
OCAPI program. Every untimed block that inherits from
"base", and that you wish to include in the compiled code
simulator, must provide a similar "CCSdecl" function.
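
The idea of a block that prints its own declaration can be tried out without OCAPI. The following standard C++ sketch uses a hypothetical "fifo_block" class that is not derived from the OCAPI "base" class and takes queue names as plain strings; it writes to a general "std::ostream" instead of an "ofstream" so that the generated text can be inspected in memory.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical untimed block that remembers the arguments it was
// constructed with, so that it can print an equivalent declaration.
class fifo_block {
public:
    fifo_block(const std::string &name, const std::string &in_queue,
               const std::string &out_queue, int depth)
        : name_(name), in_(in_queue), out_(out_queue), depth_(depth) {}

    // Write the C++ source text that would recreate this object.
    void CCSdecl(std::ostream &os) const
    {
        os << "#include \"fifo_block.h\"\n";
        os << "fifo_block " << name_ << "(\"" << name_ << "\", "
           << in_ << ", " << out_ << ", " << depth_ << ");\n";
    }

private:
    std::string name_;
    std::string in_;
    std::string out_;
    int depth_;
};

// Convenience helper: return the generated declaration as a string.
std::string declaration_of(const fifo_block &b)
{
    std::ostringstream os;
    b.CCSdecl(os);
    return os.str();
}
```

The emitted text is ordinary C++ source, so the generated simulator file can instantiate the block exactly as the interpreted program did.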

Compiling and running a compiled code simulator

The compiled code simulator is compiled and linked in the
same way as a normal OCAPI program. You must however also
provide a "main" function that drives this simulator.

The following code contains an example driver for the "add"
compiled code simulator.

  #include "qlib.h"

  void one_cycle();

  extern FB i1;
  extern FB i2;
  extern FB o1;

  void main()
  {
    i1 << dfix(1) << dfix(2) << dfix(3);
    i2 << dfix(4) << dfix(5) << dfix(6);

    one_cycle();
    one_cycle();
    one_cycle();

    while (o1.getSize())
      cout << o1.get() << "\n";
  }

When run, this program will produce the same results as
before. In contrast to the compiled simulation of your
MPEG-4 image processor, you will not be able to notice any
speed increase on this small example.

Faster communications

OCAPI uses queues as a means to communicate during
simulation. These queues however take up CPU power for
queue management. To save this power, there is an
additional queue type, "wireFB", which is used for the
simulation of point-to-point wiring connections.

The dfbfix_wire class

A "wireFB" does not move data. Instead, it is tied to a
registered driver signal. At any time, the value read from
this queue is the value defined by the registered signal.
Because of this signal requirement, a "wireFB" cannot be
used for untimed simulations. The following example of an
accumulator shows how you can use a "wireFB", or the
equivalent "dfbfix_wire".

  #include "qlib.h"

  void main()
  {
    clk ck;
    _sig a("a",ck,dfix(0));
    _sig b("b");

    dfbfix_wire A("A",a);
    dfbfix B("B");

    sfg accu;
    accu.starts();
    a = a + b;
    accu << "accu";
    accu << ip(b, B);
    accu << op(a, A);
    accu.check();

    B << dfix(1) << dfix(2) << dfix(3);

    while (B.getSize())
    {
      accu.eval(cout);
      accu.tick(ck);
    }
  }



A "wireFB" is used in the same way as a normal "FB". The
only difference is that, for each "wireFB", you indicate a
registered driver signal in the constructor.

Interconnect strategies

The "wireFB" object is related to the interconnect strategy
that you use in your system. An interconnect strategy
includes a decision on bus-switching, bus-storage, and
bus-arbitration. OCAPI does not solve this problem for you:
it depends on your application what the right
interconnection strategy is.

One default style of interconnection provided by OCAPI is
the point-to-point, register driven bus scheme. This means
that every bus carries only one signal from one processor
to another. In addition, bus storage is included in the
processor that drives the bus.

More complex interconnect strategies, like the one used in
Cathedral-2, are also possible, but will have to be
described in OCAPI explicitly. Thus, the freedom of target
architecture is not without cost. In the section "Meta-code
generation", a solution to this specification problem is
presented.
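
The difference between an ordinary queue and a register driven wire, as described for "wireFB" above, can be modeled in a few lines of standard C++. The classes below are hypothetical stand-ins ("Register", "Queue", "Wire") and do not reproduce the OCAPI classes; they only show that a wire has no storage of its own and simply exposes the present value of its driver register.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// A registered signal: `next` becomes visible as `current` on a tick.
struct Register {
    long current;
    long next;
    explicit Register(long init) : current(init), next(init) {}
    void tick() { current = next; }
};

// Ordinary queue: every put stores a token, every get removes one.
class Queue {
public:
    void put(long v) { data_.push_back(v); }
    long get()
    {
        assert(!data_.empty());
        long v = data_.front();
        data_.pop_front();
        return v;
    }
    std::size_t size() const { return data_.size(); }
private:
    std::deque<long> data_;
};

// Wire: no storage and no queue management. Reading returns the
// present value of the driver register.
class Wire {
public:
    explicit Wire(const Register &driver) : driver_(&driver) {}
    long get() const { return driver_->current; }
private:
    const Register *driver_;
};

// Accumulate all queued inputs into the register, one per clock
// cycle, and return the value seen on the wire afterwards.
long accumulate(Queue &in, Register &acc, const Wire &out)
{
    while (in.size() > 0) {
        acc.next = acc.current + in.get();
        acc.tick();
    }
    return out.get();
}
```

Reading the wire never consumes anything, so it can be read any number of times per cycle, which matches the point-to-point, register driven bus scheme.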

Meta-code generation

OCAPI internally uses meta-code generation. This means that
there are code generators that generate new "fsm", "sfg"
and "sig" objects, which in turn can be translated to
synthesizable code.

Meta-code generation is a powerful method to increase the
abstraction level at which a specification can be made.
This way, it is also possible to make parametrized
descriptions, possibly using conditions. Therefore, it is
the key method for soft-chip components, which are software
programs that translate themselves to a wide range of
implementations, depending on the user requirements.

The meta-code generation mechanism is also available to the
user. To demonstrate this, a class will be presented that
generates an ASIP datapath decoder.

An ASIP datapath idiom

An ASIP datapath, when described as a timed description
within OCAPI, will consist of a number of signal flowgraphs
and a finite state machine. The signal flowgraphs express
the different functions to be executed by the datapath. The
fsm description is a degenerate one that uses one
transition per decoded instruction. The transition
condition is expressed by the "instruction" input, and
selects the appropriate signal flowgraph for execution.

Because the finite state machine has a fixed, but
parametrizable structure, it is a natural candidate for
meta-code generation. You can construct a "decoder" object
that generates the "fsm" for you. This allows compact
specification of the instruction set.

First, the "decoder" object (which is present in OCAPI)
itself is presented.




  -- the include file
  #define MAXINS 100
  #include "qlib.h"

  class decoder : public base
  {
    public:
      decoder(char *_name, clk &ck, dfbfix &_insq);
      void dec(int _numinstr);
      ctlfsm &fsm();
      void dec(int _code, sfg &);
      void dec(int _code, sfg &, sfg &);
      void dec(int _code, sfg &, sfg &, sfg &);
    private:
      char *name;
      clk *ck;
      dfbfix *insq;
      int inswidth;
      int numinstr;
      int codes[MAXINS];
      ctlfsm _fsm;
      state active;
      sfg decode;
      _sigarray *ir;
      cnd *deccnd(int);
      void decchk(int);
  };



  -- the .cxx file
  #include "decoder.h"

  static int numbits(int w)
  {
    int bits = 0;
    while (w)
    {
      bits++;
      w = w >> 1;
    }
    return bits;
  }

  int bitset(int bitnum, int n)
  {
    return (n & (1 << bitnum));
  }

  decoder::decoder(char *_name, clk &_ck, dfbfix &_insq)
    : base(_name)
  {
    name = _name;
    insq = _insq.asSource(this);
    ck = &_ck;
    numinstr = 0;
    inswidth = 0;
    _fsm << _name;
    // active << strapp(name, "_go_");
    active << "go";
    _fsm << deflt(active);
  }

  void decoder::dec(int n)
  {
    // define a decoder that decodes n instructions
    // instruction numbers are 0 to n-1
    // create also the instruction register
    if (!(n>0))
    {
      cerr << "*** ERROR: decoder " << name
           << " must have at least one instruction\n";
      exit(0);
    }
    inswidth = numbits(n-1);
    if (n > MAXINS)
    {
      cerr << "*** ERROR: decoder " << name
           << " exceeds decoding capacity\n";
      exit(0);
    }

    dfix bit(0, 1, 0, dfix::ns);
    ir = new _sigarray((char *) strapp(name, "_ir"),
                       inswidth, ck, bit);
    decode.starts();
    int i;
    SIGW(irw, dfix(0, inswidth, 0, dfix::ns));
    for (i=0; i<inswidth; i++)
    {
      if (i)
        (*ir)[i] = cast(bit, irw >>
                   _sig(dfix(i, inswidth, 0, dfix::ns)));
      else
        (*ir)[i] = cast(bit, irw);
    }
    decode << strapp("decod", name);
    decode << ip(irw, *insq);
  }

  void decoder::decchk(int n)
  {
    // check if the decoder can decode this instruction
    int i;
    if (!inswidth)
    {
      cerr << "*** ERROR: decoder " << name
           << " must first define an instruction width\n";
      exit(0);
    }
    if (n > ((1 << inswidth)-1))
    {
      cerr << "*** ERROR: decoder " << name
           << " cannot decode code " << n << "\n";
      exit(0);
    }
    for (i=0; i<numinstr; i++)
    {
      if (n == codes[i])
      {
        cerr << "*** ERROR: decoder " << name
             << " decodes code " << n << " twice\n";
        exit(0);
      }
    }
    codes[numinstr] = n;
    numinstr++;
  }

  cnd *decoder::deccnd(int n)
  {
    // create the transition condition that corresponds
    // to the instruction number n
    int i;
    cnd *cresult = 0;
    if (bitset(0, n))
      cresult = &_cnd((*ir)[0]);
    else
      cresult = &(!_cnd((*ir)[0]));

    for (i = 1; i < inswidth; i++)
    {
      if (bitset(i, n))
        cresult = &(*cresult && _cnd((*ir)[i]));
      else
        cresult = &(*cresult && !_cnd((*ir)[i]));
    }
    return cresult;
  }

  void decoder::dec(int n, sfg &s)
  {
    // enter an instruction that executes one sfg
    decchk(n);
    active << *deccnd(n) << decode << s << active;
  }

  void decoder::dec(int n, sfg &s1, sfg &s2)
  {
    // enter an instruction that executes two sfgs
    decchk(n);
    active << *deccnd(n) << decode << s1 << s2 << active;
  }

  void decoder::dec(int n, sfg &s1, sfg &s2, sfg &s3)
  {
    // enter an instruction that executes three sfgs
    decchk(n);
    active << *deccnd(n) << decode << s1 << s2 << s3
           << active;
  }

  ctlfsm & decoder::fsm()
  {
    return _fsm;
  }

The main principles of generation are the following. Each 
instruction for the ASIP decoder is defined as a number, in 
addition to one to three signal flowgraphs that need to be 
executed when this instruction is decoded. The "decoder" 
object keeps track of the instruction numbers already used 
and warns you if you introduce a duplicate. When the 
instruction number is unique, it is split up into a number 
of instruction bits, and a fsm transition condition is 
constructed from these bits. 
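
The bookkeeping behind these principles can be exercised outside OCAPI. The sketch below, in standard C++, mirrors the width computation, the range and duplicate checks, and the bit-wise construction of the transition condition. It is an illustration only: "condition" returns the condition as text instead of an OCAPI condition object, and the "CodeTable" class is a hypothetical replacement for the "codes" array of the "decoder" object.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Number of bits needed to represent w (0 needs 0 bits).
int numbits(unsigned w)
{
    int bits = 0;
    while (w) {
        bits++;
        w >>= 1;
    }
    return bits;
}

// Textual form of the transition condition for instruction `code` on
// an instruction register of `width` bits, for example
// "ir[0] && !ir[1]" for code 1 and width 2.
std::string condition(int code, int width)
{
    std::ostringstream c;
    for (int i = 0; i < width; ++i) {
        if (i) c << " && ";
        if (((code >> i) & 1) == 0) c << "!";
        c << "ir[" << i << "]";
    }
    return c.str();
}

// Bookkeeping that rejects out-of-range and duplicate codes.
class CodeTable {
public:
    explicit CodeTable(int num_instructions)
        : width_(num_instructions > 0
                     ? numbits(static_cast<unsigned>(num_instructions - 1))
                     : 0)
    {
        assert(num_instructions > 0);
    }

    // Returns true if the code was accepted, false if it does not fit
    // in the instruction register or was entered before.
    bool add(int code)
    {
        if (code < 0 || code > (1 << width_) - 1) return false;
        return used_.insert(code).second;
    }

    int width() const { return width_; }

private:
    int width_;
    std::set<int> used_;
};
```

For a decoder with four instructions the instruction register is two bits wide, and each accepted code yields one conjunction over those two bits.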

The ASIP datapath at work

The use of this object is quite simple. In a timed
description where you want to use the decoder instead of a
plain "fsm", you inherit from this decoder object rather
than from the "base" class. Next, instead of the fsm
description, you give the instruction list and the required
signal flowgraphs to execute.

As an example, an add/subtract ASIP datapath is defined. We
select addition with instruction number 0, and subtraction
with instruction number 1. The following code (which also
uses the supermacros) shows the specification. The
inheritance from "decoder" also establishes the connection
to the instruction queue.



-- include file

#ifndef ASIP_DP_H
#define ASIP_DP_H

class asip_dp : public decoder
{
public:
    asip_dp(char *name,
            clk &ck,
            FB &ins,
            _PRT(in1),
            _PRT(in2),
            _PRT(o1));

private:
    PRT(in1);
    PRT(in2);
    PRT(o1);
};

#endif



-- code file

#include "asip_dp.h"

dfix typ(0, 8, 0);

asip_dp::asip_dp(char *name,
                 clk &ck,
                 FB &ins,
                 _PRT(in1),
                 _PRT(in2),
                 _PRT(o1))
    : decoder(name, ck, ins),
      IS_SIG(in1, typ),
      IS_SIG(in2, typ),
      IS_SIG(o1, typ)
{
    IS_IP(in1);
    IS_IP(in2);
    IS_OP(o1);

    SFG(add)
    GET(in1)
    GET(in2)
    o1 = in1 + in2;
    PUT(o1);

    SFG(sub)
    GET(in1)
    GET(in2)
    o1 = in1 - in2;
    PUT(o1);

    dec(2); // decode two instructions
    dec(0, SFGID(add));
    dec(1, SFGID(sub));
}

To conclude, one can note that meta-code generation allows
reuse of design "idioms" (classes) rather than design
"instances" (objects). Intellectual-property code
generators are a direct consequence of this.

Description of a design of systems according to the method
of the invention

In the design of a telecommunication system
(fig. 1A), we distinguish four phases: link design,
algorithm design, architecture design and circuit design.
These phases are used to define and model the three key
components of a communication system: a transmitter, a
channel model, and a receiver.

• The link design (1) is the requirement capture phase.
Based on telecommunication properties such as
transmission bandwidth, power, and data throughput (the
link requirements), the system design space is explored
using small subsystem simulations. The design space
includes all algorithms which can be used by a
transmitter/receiver pair to meet the link requirements.
Out of receiver and transmitter algorithms with an
identical functionality, those with minimal complexity
are preferred. Besides this exploration, any expected
transmission impairment must also be modeled into a
software channel model.

• The algorithm design phase (2) selects and interconnects
the algorithms identified in the link design phase. The
output is a software algorithmic description in C++ of
digital transmitter and receiver parts in terms of
floating point operations. To express parallelism in the
transmitter and receiver algorithms, a data-flow data
model is used. Also, the transmission imperfections
introduced by analog parts such as the RF front-ends are
annotated to the channel model.

• The architecture design (3) refines the data model of the
transmitter or receiver. The target architectural style
is optimized for high speed execution, and uses distributed
control semantics and pipeline mechanisms. The resulting
description is a fixed point, cycle true C++ description
of the algorithms in terms of execution on bit-parallel
operators. The architecture design is finished with a
translation of this description to synthesizable VHDL.

• Finally, circuit design (4) refines the bit-parallel
implementation to circuit level, including technology
binding, the introduction of test hardware, and design
rule checks.

Target Architecture 

The target architecture (5), shown in figure 2, consists of
a network of interconnected application specific
processors. Each processor is made up of bit-parallel
datapaths. When hardware sharing is applied, a local
control component is also needed to perform instruction
sequencing. The processors are obtained by behavioral
synthesis tools or RT level synthesis tools. In either
case, circuits with a low amount of hardware sharing are
targeted. The network is steered by one or multiple clocks.
Each clock signal defines a clock region. Inside a clock
region the phase relations between all register clocks are
manifest. Clock division circuits are used to derive the
appropriate clock for each processor.

Between processors, hardware queues are present to
transport data signals. These queues increase parallelism
inside a clock region and maintain consistency between
different streams of data arriving at one processor.

Across clock region boundaries, synchronization interfaces 
are used. These interfaces detect the presence of data at 
the clock region boundary and gate clock signals for the 
clock region that they feed. This way, non-manifest and 
variable data rates in between clock regions are supported. 

The ensemble of clock dividers and handshake circuits forms
a parallel scheduler in hardware, synchronizing the
processes running on the bit-parallel processor.

Overview of the C++ modeling levels 

An overview of the distinct C++ modeling levels used by
OCAPI is given in figure 3. The C++ modeling spans three
subsequent levels in the design flow: the link level, the
algorithm level and the architecture level. The transition
to the last level, the circuit level, is made by automated
means through code generation. Usually, VHDL is used as the
design language in this lowest level.

The link level is available through data-vector modeling.
Using a design mechanism called parallelism scaling, this
level is refined to the algorithm level. The algorithm
level uses data-flow semantics. Using two distinct refining
mechanisms in the data-flow level, we can refine this level
to a register transfer level.

The two refining mechanisms are clock cycle true modeling
and fixed point modeling. Clock cycle true modeling is
achieved by allocating cycle budgets and operators for each
algorithm. To help the designer in this decision, operation
profiling is foreseen. Fixed point modeling restricts the
dynamic range of variables in the algorithms to a range for
which a hardware operator can be devised. Signal statistics
are returned by the design to help the designer with this.

The last level, the architecture model, uses a signal
flowgraph to provide a behavioral description. Using this
description, synthesizable code is generated. The resulting
code can then be mapped onto gates using a register-transfer
design tool such as DC of Synopsys.

Data-vector modeling

The upper level of representation of a communication system
is the link level. It has the following properties:

• It uses pure mathematical manipulation of functions. Time
is explicitly manipulated and results in irregular-flow
descriptions.

• It uses abstraction of all telecommunication aspects that
are not relevant to the problem at hand.

In this representation level, MATLAB is used for
simulation. MATLAB uses the data-vector as the basic data
object. To represent time functions in MATLAB, they are
sampled at an appropriate rate. Time is present as one of
the many vector dimensions. For example, the MATLAB vector
addition

a = b + c;

can mean both sequential addition in time (if the b and c
vectors are thought of as time-sequential), or parallel
addition (if b and c happen to be defined at one moment in
time). MATLAB simply makes no distinction between these two
cases.

Besides this time-space feature, MATLAB has a lot of other
properties that make it the tool of choice within this
design level:

• The ease with which irregular flow of data is expressed 
with vector operations. For example, the operation 
max(vector), or std(vector). 

• The flexibility of operations. A maximum operation on a
vector of 10 elements or 1000 elements looks identical:
max(vector).

• The interactivity of the tool, and the transparency of 
data object management. 

• The extended library of operations, that allow very dense 
description of functionality. 

• Graphics and simulation speed. 

This data-vector restriction is to be refined to a data-flow
graph representation of the system. Definition of the
data-flow graph requires definition of all actors in the
graph (actor contents as well as actor firing rules) and
definition of the graph layout.

In order to design systems effectively with the SOC++ 
design flow, a smooth transition between the data-vector 
level and the data-flow level is needed. A script to 
perform this task is constructed as can be seen in the 
following example. 

Example 1: design of a telecommunication system
Initial data-vector description

We consider a pseudonoise (PN) code correlator inside a
direct sequence spread-spectrum (DS/SS) modem as an example
(figure 4).

% input data
in = [1 2 1 3 3 4 1 2];

% spreading code
c = [1 -1 1 -1];

% correlate
ot = corr(in, c)

% find correlation peak
[max, maxpos] = max(ot);

A vector of input data in is defined containing 8 elements.
These are subsequent samples taken from the chip
demodulator in the spread spectrum modem. The dimension of
in thus corresponds to the time dimension. The input vector
in is in principle infinite in length. For simulation
purposes, it is restricted to a data set which has the same
average properties (distribution) as the expected received
data.

The samples of in are correlated with the PN-code vector of
length 4, c. The output vector ot thus contains 5 samples,
corresponding to the five positions of in at which c can be
aligned. The max function locates the maximum value and
position inside the correlated data. The position maxpos is
subsequently used to synchronize the PN-code vector with
the incoming data and thus is the desired output value of
the algorithm.
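
As a cross-check of the numbers quoted above, the correlation can be written out in plain C++. This is a sketch under an assumption: corr is taken to be the sliding correlation that the paragraph describes (one output per alignment of c inside in), which is not necessarily the built-in MATLAB function of the same name. The helper names slide_corr and peak are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sliding correlation: one output value per alignment of c inside in.
std::vector<int> slide_corr(const std::vector<int>& in,
                            const std::vector<int>& c) {
    std::vector<int> ot;
    if (c.empty() || in.size() < c.size()) return ot;
    for (std::size_t k = 0; k + c.size() <= in.size(); ++k) {
        int acc = 0;
        for (std::size_t j = 0; j < c.size(); ++j)
            acc += in[k + j] * c[j];
        ot.push_back(acc);
    }
    return ot;
}

// Returns {maximum value, 1-based position}, like [max, maxpos] = max(ot).
// The vector must not be empty.
std::pair<int, std::size_t> peak(const std::vector<int>& ot) {
    int best = ot.at(0);
    std::size_t pos = 1;
    for (std::size_t i = 1; i < ot.size(); ++i) {
        if (ot[i] > best) {
            best = ot[i];
            pos = i + 1;
        }
    }
    return {best, pos};
}
```

For the example vectors, the five correlation values are -3, 1, -3, 3 and -2, so the peak is 3 at position 4. Locating it takes 4 scalar comparisons, one fewer than the length of ot.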

This code is an elegant and compact specification, yet it 
offers some open questions for the PN-correlator designer: 

• The algorithm has an implicit startup-effect. The first
correlation value can only be evaluated after 4 input
samples are available. From then on, each input sample
yields an additional correlation value.

• The algorithm misses the common algorithmic iteration
found in digital signal processing applications: each
statement is executed only once.

• For the implementation, no statement is made regarding
the available cycle budget. This is however an important
specification for the attainable acquisition speed of the
modem.

All of these questions are caused by the parallelism of the
data-vector description.

We now propose a way to make the parallelism of the
operations more visible. Each of the MATLAB operations is
easily interpreted. Inside the MATLAB simulation, the
length of the operands will first be determined in order to
select the correct operation behavior. For example,

[max, maxpos] = max(ot)

determines the maximum on a vector of length 5 (which is
the length of the operand ot). It needs at least 4 scalar
comparisons to evaluate the result. If ot were longer,
more scalar comparisons would be needed. To indicate this
in the description, we explicitly annotate each specific
instance of the generic operations with the length of the
input vectors.



% input data
in = [1 2 1 3 3 4 1 2];
8

% spreading code
c = [1 -1 1 -1];
4

% correlate
ot = corr(in, c)
5         8,4

% find correlation peak
[max, maxpos] = max(ot);
1               5

This little annotation helps us to see the complexity of
the operations more clearly. We will use this when
considering implementation of the description in hardware.
It is of course not the intention to force a user to do
this (MATLAB does this already for him/her).

When thinking about the implementation of this correlator,
one can imagine different realizations, each having a
different amount of parallelism, that is, a different mapping
of all the operations inside corr() and max() onto a time/space
axis. This is the topic of the next section.

Scaled description

Consider again the definition of the PN code, as in:

% spreading code
c = [1 -1 1 -1];
4

This MATLAB description defines the variable c to be a
data-vector containing 4 different values. This vector
assignment corresponds to 4 concurrent scalar assignments.
We therefore say that the maximal attainable parallelism in
this statement is 4.

In order to achieve this parallelism in the implementation,
there must be hardware available to perform 4 concurrent
scalar assignments. Since a scalar assignment in hardware
corresponds to driving a data bus to a certain state, we
need 4 busses in the maximal parallel implementation. If
only one bus were desired, then we would have to
indicate this. For each of the statements inside the MATLAB
description, a similar story can be constructed. The
indication of the amount of parallelism is an essential
step in the transition from data-vectors to data-flow. We
call this the scaling of parallelism. It involves a
restriction of the unspecified communication bandwidth in
the MATLAB description to a fixed number of communication
busses. It is indicated as follows in the MATLAB
description.

% input data
in = [1 2 1 3 3 4 1 2];
8@1

% spreading code
c = [1 -1 1 -1];
4@4

% correlate
ot = corr(in, c)
5@1       8,4

% find correlation peak
[max, maxpos] = max(ot);
1@1             5

As is seen, each assignment is extended with a @i
annotation, that indicates how the parallelism in the data
vectors is ordered onto a time axis. For example, the 8
input values inside in are provided sequentially by writing
8@1. The 4 values of c on the other hand, are provided
concurrently. We see that, whatever implementation of the
corr operation we might use, at least 8 iterations will be
required, simply to provide the data to the operation.
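
The iteration count implied by such an annotation can be worked out mechanically. The following C++ sketch assumes the reading "N@P means N values moved P at a time", which matches the examples in the text (8@1 sequential, 4@4 concurrent). The Scaling type and parse_scaling function are hypothetical helpers, not part of any tool described here.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

struct Scaling {
    int length;       // number of values in the vector (N in "N@P")
    int parallelism;  // values transferred concurrently (P in "N@P")

    // Number of sequential transfers needed to move the whole vector.
    int iterations() const {
        return (length + parallelism - 1) / parallelism;
    }
};

// Parse an annotation such as "8@1" or "4@4".
Scaling parse_scaling(const std::string& text) {
    const std::size_t at = text.find('@');
    if (at == std::string::npos)
        throw std::invalid_argument("missing '@'");
    Scaling s{std::stoi(text.substr(0, at)), std::stoi(text.substr(at + 1))};
    if (s.length <= 0 || s.parallelism <= 0 || s.parallelism > s.length)
        throw std::invalid_argument("invalid scaling");
    return s;
}
```

Under this reading, 8@1 needs 8 transfers over one bus, while 4@4 needs a single transfer over four busses.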

At this moment, the description is getting closer to the
data-flow level, that uses explicit iteration. One more
step is required to get to the data flow graph level. This
is the topic of the next section.

Data flow graph definition

In order to obtain a graph, the actors and edges inside
this graph must be defined. Inside the annotated MATLAB
description, data precedences are already present through
the presence of the names of the vectors. The only thing
that is missing is the definition of actor boundaries;
edges will then be defined automatically by the data
precedences going across the actor boundaries.

This can be done by a new annotation to the MATLAB 
description. Three actors will be defined in the DS/SS 
correlator. 



actor1 {
% input data
in = [1 2 1 3 3 4 1 2];
8@1
}

actor2 {
% spreading code
c = [1 -1 1 -1];
4@4
% correlate
ot = corr(in, c)
5@1       8,4
}

actor3 {
% find correlation peak
[max, maxpos] = max(ot);
1@1             5
}

Again the annotation should be seen as purely conceptual;
it is not intended for the user to write this code. Given
these annotations, a data flow graph can be extracted from
the scaled MATLAB description in an unambiguous way.

• actor1 is an actor with no input, and one output, called
in.

• actor2 is an actor with one input in and one output ot.

• actor3 is an actor with one input ot and outputs maxpos and
max.

Furthermore, the simulation uses queues to transport
signals in between the actors. We need three queues, called
in, ot and maxpos.



The missing piece of information for simulation of this
dataflow graph is the set of firing rules (or equivalently
the definition of productions and consumptions on each edge).
A naive data flow model is shown in figure 4: actor1 (10)
produces 8 values, which are correlated by actor2 (11),
while the maximum is selected inside actor3 (12).

This would however mask the parallelism scaling operation
inside the MATLAB description. For example, it was chosen
to provide the 8 values of the in vector in a sequential
way over a parallel bus. It is believed that the multi-rate
SDF model therefore is not a good container for the
annotated MATLAB description.

Another approach is a cyclostatic description. In this case
we have a graph as in figure 5.

We see that the determination of production patterns
involves examining the latencies of operations internal to
the actor. This increases the complexity of the design
script. It is simpler to perform a demand driven scheduling
of all actors. The firing rule only has to examine the
availability of input tokens.
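
A firing rule of this kind can be illustrated with a stand-alone C++ sketch of the three-actor correlator. It is a simplification made for illustration and not the OCAPI scheduler: the function run_graph and the CorrelatorResult type are hypothetical, the queues are ordinary std::queue objects, and each actor fires whenever a token is available on its input.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <queue>
#include <vector>

struct CorrelatorResult {
    int max;             // value of the correlation peak
    std::size_t maxpos;  // 1-based position of the peak
    std::size_t fired3;  // number of firings of actor3
};

CorrelatorResult run_graph(const std::vector<int>& samples,
                           const std::vector<int>& code) {
    std::queue<int> in, ot;       // queues between the actors
    std::size_t produced = 0;     // actor1 state: next sample to emit
    std::deque<int> window;       // actor2 state: most recent samples
    CorrelatorResult r{0, 0, 0};  // actor3 state

    bool fired = true;
    while (fired) {               // stop when no actor can fire any more
        fired = false;
        // actor1: no input, produces one sample per firing (8@1)
        if (produced < samples.size()) {
            in.push(samples[produced++]);
            fired = true;
        }
        // actor2: fires whenever a token is available on queue in
        if (!in.empty() && !code.empty()) {
            window.push_back(in.front());
            in.pop();
            if (window.size() > code.size()) window.pop_front();
            if (window.size() == code.size()) {  // start-up effect is over
                int acc = 0;
                for (std::size_t j = 0; j < code.size(); ++j)
                    acc += window[j] * code[j];
                ot.push(acc);
            }
            fired = true;
        }
        // actor3: fires whenever a token is available on queue ot
        if (!ot.empty()) {
            const int v = ot.front();
            ot.pop();
            ++r.fired3;
            if (r.fired3 == 1 || v > r.max) {
                r.max = v;
                r.maxpos = r.fired3;
            }
            fired = true;
        }
    }
    return r;
}
```

For the example data, actor2 produces nothing during its first three firings (the start-up effect) and one correlation value on each of the next five, so actor3 fires five times.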

The desired dataflow format as in figure 6 is thus situated
in between the multirate SDF level and the cyclostatic SDF
level. It is proposed to annotate consumptions and
productions in the same way as it was written down in the
MATLAB description:

• 8@1 is the production of actor1. It means: 8 samples are
produced one at a time.

• 8@1 and 5@1 are the consumption and production of actor2
respectively.

• 5@1 and 1@1, 1@1 are the consumption and productions of
actor3.



Data-flow simulation

Given an annotated MATLAB description, a simulation can now
be constructed by writing a high-level model for each
actor, interconnecting these with queues and constructing a
system schedule. OCAPI provides both a static scheduler and
a demand-driven scheduler.

Out of this simulation, several statistics are gathered: 

• On each queue, put and get counts are observed, as well 
as signal statistics (minimum and maximum values) . The 
signal statistics provide an idea of the required 
buswidths of communication busses. 

• The scheduler counts the firings per actor, and operation 
executions (+, *, ...) per actor. This profiling helps 
the designer in deciding cycle budgets and hardware 
operator allocation for each actor. 

These statistics are gathered through a C++ operator 
overloading mechanism, so the designer gets them for free 
if he uses the appropriate C++ objects (schedule, queue and 
token class types) for simulation. 
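
How operator overloading can collect such statistics is sketched below in plain C++. The classes Token, OpCounts and StatQueue are hypothetical illustrations and are not the OCAPI class types: arithmetic on tokens increments operation counters as a side effect, and the queue records put and get counts together with the observed value range.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <queue>

struct OpCounts {
    long adds = 0;  // number of + operations executed
    long muls = 0;  // number of * operations executed
};

// Token wrapper: arithmetic is counted in the OpCounts it points to.
class Token {
public:
    Token(double v, OpCounts* c) : value_(v), counts_(c) {}
    double value() const { return value_; }

    friend Token operator+(const Token& a, const Token& b) {
        ++a.counts_->adds;
        return Token(a.value_ + b.value_, a.counts_);
    }
    friend Token operator*(const Token& a, const Token& b) {
        ++a.counts_->muls;
        return Token(a.value_ * b.value_, a.counts_);
    }

private:
    double value_;
    OpCounts* counts_;
};

// Queue wrapper recording put/get counts and the observed value range.
class StatQueue {
public:
    void put(const Token& t) {
        ++puts;
        min_seen = std::min(min_seen, t.value());
        max_seen = std::max(max_seen, t.value());
        q_.push(t);
    }
    // The caller must make sure the queue is not empty.
    Token get() {
        ++gets;
        Token t = q_.front();
        q_.pop();
        return t;
    }

    long puts = 0;
    long gets = 0;
    double min_seen = std::numeric_limits<double>::infinity();
    double max_seen = -std::numeric_limits<double>::infinity();

private:
    std::queue<Token> q_;
};
```

The actor code itself stays ordinary arithmetic (for example a * b + a); the counts and ranges accumulate without any extra instrumentation in the algorithm.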

We are next interested in the detailed clock-cycle true 
behavior of the actors and the required storage and 
handshake protocol circuits on the communication busses. 
This is the topic of the next step, the actor definition. 

Actor definition 

The actor definition is based on two elements:

• Signal-flowgraph representation of behavior.

• Time-verification of the system.

The two problems can be solved independently using the
annotated MATLAB code as specification. In OCAPI:

• The actor RT modeling proceeds in C++ and can be freely
intermixed with high level descriptions regarding both
operator wordlength effects and clock-cycle true timing.

• The time-verification approach allows the system
feasibility to be checked at all times by warning the
designer of deadlock and/or causality violations of the
communication.

Signal flowgraph definition 

Within the OCAPI design flow, a class library was developed
to simulate behavior at RT-level. It allows the designer:

• To express the behavior of an algorithm with arbitrary
implementation parallelism by setting up a signal flow
graph (SFG) data structure.

• To simulate the behavior of an actor at a clock-cycle
true level by interpreting this SFG data structure with
instantiated token values.

• To specify wordlength characteristics of operations
regarding sign, overflow and rounding behavior. Through
explicit modeling of the quantization characteristic
rather than the bit-vector representation (as in SPW),
efficient simulation runtimes are obtained.

• To generate C++ code for this actor, and hence perform
the clock cycle true simulation with compiled code.

• To generate VHDL code for this actor, and synthesize an
implementation with Synopsys DC.

• To generate DSFG code for this actor, and synthesize an
implementation with Cathedral-3. It was observed that
Cathedral-3 performs a better job with relation to both
critical path and area of the obtained circuits than
Synopsys DC. The best synthesis results are obtained by
first using Cathedral-3 to generate a circuit at gate
level and then Synopsys DC to perform additional logic
optimization as a postprocessing step.
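
Modeling the quantization characteristic rather than a bit vector can be pictured as follows. This C++ sketch is an assumption made for illustration: the function quantize, its parameters (total width, number of fractional bits) and the two enumerations are hypothetical and do not reproduce the parameters of the dfix type used in the listings. The value stays an ordinary double and is only rounded and range-limited.

```cpp
#include <cassert>
#include <cmath>

enum class Overflow { Saturate, Wrap };
enum class Rounding { Round, Truncate };

// Quantize x to a signed fixed-point format with `width` total bits,
// `frac` of which are fractional. Works on the value, not on bits.
double quantize(double x, int width, int frac, Overflow ov, Rounding rd) {
    const double scale = std::ldexp(1.0, frac);          // 2^frac
    const double lo = -std::ldexp(1.0, width - 1);       // most negative code
    const double hi = std::ldexp(1.0, width - 1) - 1.0;  // most positive code

    double code = x * scale;
    // Round to nearest, or truncate toward minus infinity.
    code = (rd == Rounding::Round) ? std::floor(code + 0.5) : std::floor(code);

    if (code < lo || code > hi) {
        if (ov == Overflow::Saturate) {
            code = (code < lo) ? lo : hi;
        } else {
            const double span = std::ldexp(1.0, width);  // number of codes
            code -= span * std::floor((code - lo) / span);
        }
    }
    return code / scale;
}
```

With 8 bits of which 4 are fractional, 1.3 becomes 1.3125 with rounding and 1.25 with truncation, an out-of-range value such as 100 saturates to 7.9375, and 8.0 wraps around to -8.0.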

An important observation was made regarding simulation
speed. For equivalent descriptions at different
granularities, the following relative runtimes were found:

• 1 for the MATLAB simulation.

• 2 for the untimed, high level C++ data flow description.

• 4 for the timed, fixed point C++ description (compiled
code).

• 40 for the procedural, word-level VHDL description.

It is thus concluded that RT-modeling of systems within
OCAPI is possible within half an order of magnitude of the
highest level of description. VHDL modeling, however, is
much slower. Currently the figure of 40 times MATLAB is
even considered an under-estimate. Future clock-cycle based
VHDL simulators can only solve half of this problem, since
they still use bit-vector based simulation of tokens rather
than quantization based simulation.

Next, the modeling issues in C++ are shown in more detail.
The C++ signal-flowgraph representation uses a signal
data-type, that can be either a registered or else an
immediate value. With this data-type, expressions are formed
using the conventional scalar operations (+, *, shifts and
logical operations). Expressions are grouped together in a
signal flowgraph. A signal flowgraph interfaces with the
system through the data-flow simulation queues. Several
signal-flowgraphs can be grouped together to an SFG
sequence. An SFG sequence is an expression of behavior that
spans several cycles. The specification is done through a
finite state machine model, for which transition conditions
can be expressed. The concept of SFG modeling is pictured
in figure 7.

The different SFG's in combination with a
finite state machine make up the clock-cycle true actor
model. Within the actor, SFG communication proceeds through
registered signals. Communication over the boundaries of an
actor proceeds through simulation queues.

When the actor is specified in this way, and all signal
wordlengths are annotated to the description, an automated
path to synthesis is available. Several different SFG's can
be assigned to one datapath. Synthesizable code is
generated in such a way that hardware sharing between
different SFG's is possible. A finite state machine (FSM)
description is first translated to SFG format to generate
synthesizable code in the same way. There is an implicit
hierarchy available with this method: by assigning
different FSM-SFG's to one datapath, an overall processor
architecture is obtained that again has a mode port and
therefore looks like a (multicycle) datapath. For macro
control problems (such as acquisition/tracking algorithm
switching in modems), this is a necessity.

Although the distance between the annotated MATLAB level
and this RT-level SFG seems large, it is reasonable on the
actor level. Consider for example



actor3 {
% find correlation peak
[max, maxpos] = max(ot);
1@1             5
}

We are asked here to time the max() operation with an
SFG. actor2 has scaled the parallelism of ot to 5@1.
A solution is presented in actual C++ code.



{
    FB qin("qin");        // input queue
    FB q1out("qout");     // output queue
    FB q2out("qout");     // output queue
    FB start("start");    // the start pin of the processor

    clock ck;
    _sig currmax(ck, dfix(0));   // register holding current maximum
    _sig maxpos(ck, dfix(0));    // register holding position of max
    _sig currpos(ck, dfix(0));   // current position
    _sig inputvalue;             // holds input values
    _sig maxout;
    _sig maxposout;
    _sig one(dfix(1));           // a constant

    SFG sfg0, sfg1, sfg2;        // we use 3 sfg's

    // code after this is for sfg0
    sfg0.starts();
    currmax = inputvalue;
    maxpos = one;
    currpos = one;
    // next, give sfg0 a mode and an input queue
    sfg0 << "m0" << ip(inputvalue, qin);

    // code after this is for sfg1
    sfg1.starts();
    // this is a conditional assignment
    currmax = (inputvalue > currmax).cassign(inputvalue, currmax);
    maxpos = (inputvalue > currmax).cassign(currpos, maxpos);
    currpos = currpos + 1;
    sfg1 << "m1" << ip(inputvalue, qin);

    // the last SFG
    sfg2.starts();
    maxposout = (inputvalue > currmax).cassign(_sig(dfix(4)), maxpos);
    maxout = (inputvalue > currmax).cassign(inputvalue, currmax);
    sfg2 << "m2" << op(maxout, q1out) << op(maxposout, q2out);

    // the 4-state controller
    state s0("s0"), s1("s1"), s2("s2"), s3("s3");
    s0 >> !cnd(start) >> s0;
    s0 >> cnd(start) >> sfg0 >> s1;
    s1 >> allways >> sfg1 >> s2;
    s2 >> allways >> sfg1 >> s3;
    s3 >> allways >> sfg2 >> s0;
}

As an aid to interpret the C++ code, the equivalent
behavior is shown in figure 8. The behavior is modeled as a
4-cycle description. Three SFG's (13, 14, 15) are needed, in
addition to a 4-state controller (16). The controller is
modeled as a Mealy machine.
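
The role of the registered signals can be illustrated without the OCAPI classes. The plain C++ sketch below is a simplification and does not reproduce the listing state by state: the function streaming_max is hypothetical, it takes one input per cycle for as many cycles as there are inputs, and it counts positions from 1. What it does show is the register semantics: in every cycle all right-hand sides read the values from before the clock edge, and all registers are updated together.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct PeakResult {
    int max;     // maximum value seen
    int maxpos;  // 1-based position of the maximum
};

// Clock-cycle style evaluation of a running maximum.
PeakResult streaming_max(const std::vector<int>& ot) {
    int currmax = 0, maxpos = 0, currpos = 0;  // registers
    for (std::size_t cycle = 0; cycle < ot.size(); ++cycle) {
        const int inputvalue = ot[cycle];
        int next_currmax, next_maxpos, next_currpos;
        if (cycle == 0) {
            // first cycle: initialise the registers from the input
            next_currmax = inputvalue;
            next_maxpos = 1;
            next_currpos = 1;
        } else {
            // later cycles: both conditional assignments test the OLD currmax
            const bool greater = inputvalue > currmax;
            next_currmax = greater ? inputvalue : currmax;
            next_maxpos = greater ? currpos + 1 : maxpos;
            next_currpos = currpos + 1;
        }
        // clock edge: commit all registers at once
        currmax = next_currmax;
        maxpos = next_maxpos;
        currpos = next_currpos;
    }
    return {currmax, maxpos};
}
```

Because both conditional assignments see the same pre-edge value of currmax, the maximum and its position always stay consistent, which is why the two cassign expressions in the listing can share the same condition.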

The C++ description also illustrates some of the main
contributions of OCAPI: register-transfer level aspects
(signals, clocks, registers), as well as dataflow aspects
(simulation queues) are freely intermixed and used as
appropriate. By making use of C++ operator overloading and
classes, these different design concepts are represented in
a compact syntax format. Compactness is a major design
issue.

Having this specification, we have all information to
proceed with the detailed architectural design of the
actor. This is however only part of the system design
solution: we are also interested in how to incorporate the
cycle-true result in the overall system.

Time verification 

The introduction of time (clock cycles) in the simulation
uses an expectation-based approach. It allows the designer
to use either a high level or else an SFG-type description
of the actor, and to simulate the complete system clock-cycle
true. The simulation helps the designer in finding whether
his 'high-level' description matches the SFG description,
and secondly, whether the system is realizable.

A summary of the expectation based simulation is given in
figure 10 and is used to illustrate the ideas mentioned
below.

This is a different approach than when analysis is used
(e.g. the evaluation of a compile-time schedule and token
lifetimes) to force restrictions onto the actor
implementation. This traditional approach gives the
designer no clue on whether he is actually writing down a
reasonable description.

Each token in the simulation is annotated with a time when
it is created: the token age. Initial tokens are born at
age 0, and grow older as they proceed through the dataflow
graph. The unit of time is the clock cycle.

Additionally, each queue in the simulation holds a queue
age (say, 'the present') that is used to check the
causality of the simulation: a token entering a queue
should not be younger than this boundary. A queue is only
able to delay tokens (registers), and therefore can only
work with tokens that are older than the queue age.

If such a consistency violation is detected, a warning
message is issued and the token age is adapted to that of
the queue. Otherwise, the time boundary of the queue is
updated with the token age after the token is installed on
the queue.

The queue age is steered by the actor that drives it. For
each actor the designer formulates an iteration time. The
iteration time corresponds to the cycle budget that the
designer expects to need for the detailed actor
description. Upon each actor firing, the queues driven by
the actor are aged with the iteration time.

At the same time, the actor operations also increase the
age of the tokens they process. For normal operations, the
resulting token age is equal to the maximum of the operand
token ages. For registered signals (only present in
SFG-level actor descriptions), the token age is increased by
one. Besides aging by operation, aging inside of the queues
is also possible by attaching a travel delay to each queue.
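
The queue-side bookkeeping described so far can be sketched in plain C++. The AgedQueue and AgedToken types are hypothetical and not the OCAPI queue class. Two details are assumptions of this sketch because the text does not fix them: the travel delay is added before the causality check, and the firing of the driving actor is modeled by an explicit call to fire().

```cpp
#include <cassert>
#include <deque>

struct AgedToken {
    double value;
    long age;  // creation time, in clock cycles
};

class AgedQueue {
public:
    explicit AgedQueue(long travel_delay = 0) : travel_delay_(travel_delay) {}

    // Called when the driving actor fires: the queue age advances
    // by the actor's iteration time.
    void fire(long iteration_time) { age_ += iteration_time; }

    // Install a token. Returns false when a causality warning was raised.
    bool put(AgedToken t) {
        t.age += travel_delay_;  // aging inside the queue
        bool ok = true;
        if (t.age < age_) {      // token is younger than the queue age
            ++warnings;
            t.age = age_;        // adapt the token age to that of the queue
            ok = false;
        } else {
            age_ = t.age;        // move the queue's time boundary forward
        }
        tokens_.push_back(t);
        return ok;
    }

    long age() const { return age_; }
    const AgedToken& back() const { return tokens_.back(); }

    long warnings = 0;

private:
    long travel_delay_;
    long age_ = 0;
    std::deque<AgedToken> tokens_;
};
```

A token of age 2 arriving on a queue whose age has already advanced to 4 triggers a warning and is re-stamped with age 4. A token of age 6 is accepted and moves the queue age to 6.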

Like the high-level actor description, a queue is also
annotated with a number of expectations. These annotations
reflect what the implementation of the queue as a set of
communication busses should look like.

A communication bus contains one or more registers to
provide intermediate storage, and optionally also a
handshake-protocol circuit. A queue then maps to one or
more (for parallel communication) of these communication
busses.

The expectations for a simulation queue are:

• The token concurrency, which expresses how many tokens of
the same age can be present on one queue. To communicate
a MATLAB vector annotated with 8@2, for example, requires
two communication busses. This is reflected in the high
level queue model by setting the token concurrency to
two.

• In case the token concurrency is 1, it can be required
that subsequent tokens are separated by a determined
number of clock cycles. In combination with the travel
delay, this determines how many registers are needed on a
communication bus. This expectation is called the token
latency.

Example implementations for different expectations are
shown in figure 9.

When the token concurrency is different from one, the token
latency cannot be bigger than one. If it were, then the
actor that provides the tokens could be designed more
effectively using hardware sharing, thus reducing the
token concurrency.

A summary of the expectation based simulation is put as
follows. First, there are several implicit adaptations to
token ages and queue ages.

• An actor description increases the queue age upon each
actor iteration with the iteration time.

• A queue increases the age of communicated tokens with the
travel delay.

• An SFG description increases token ages through the
operations. The token age after a register is increased
by one; all other operations generate a token with age
equal to the maximum of the operand ages.

The set of operations that modify the token age are 
referred to as token aging rules. 
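
The aging rules for operations are compact enough to state as code. In this C++ sketch the Sig type and the add, mul and reg helpers are hypothetical names chosen for illustration; they are not OCAPI identifiers.

```cpp
#include <algorithm>
#include <cassert>

struct Sig {
    double value;
    long age;  // token age, in clock cycles
};

// Combinational operations: the result is as old as its oldest operand.
Sig add(const Sig& a, const Sig& b) {
    return {a.value + b.value, std::max(a.age, b.age)};
}
Sig mul(const Sig& a, const Sig& b) {
    return {a.value * b.value, std::max(a.age, b.age)};
}

// A register delays its input by one clock cycle.
Sig reg(const Sig& a) {
    return {a.value, a.age + 1};
}
```

Adding a token of age 3 to a token of age 7 yields age 7; passing the result of a further operation through a register yields age 8.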

Next, a number of checks are active to verify the
consistency of the simulation.

• A token age cannot be younger (smaller) than a queue age.

• The token concurrency on a queue cannot be exceeded.

• The token latency on a queue cannot be exceeded.

A successful clock-cycle true simulation should never fail
any of these checks. In the case of such success, the
expectations on the queue can be investigated more closely
to devise a communication bus for it. This description
has not mentioned the use of handshake protocol circuits.
A handshake protocol circuit can be used to synchronize
tokens of different age at the input of an actor.
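As a rough sketch of how such checks might be attached to a queue, the C++ fragment below models a queue that tracks its own age and rejects tokens that violate an expectation. The class CheckedQueue and its members are hypothetical. The latency check is interpreted here as a minimum age separation between subsequent tokens, which is an assumption.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

struct AgedToken {
    double value;
    int age;
};

// Hypothetical simulation queue with expectation checks (not the OCAPI class).
class CheckedQueue {
public:
    CheckedQueue(int iteration_time, int travel_delay, int concurrency, int latency)
        : iteration_time_(iteration_time), travel_delay_(travel_delay),
          concurrency_(concurrency), latency_(latency) {}

    // Called once per iteration of the producing actor: the queue ages.
    void next_iteration() { queue_age_ += iteration_time_; }

    void put(AgedToken t) {
        t.age += travel_delay_;  // the queue ages the token by the travel delay
        if (t.age < queue_age_)
            throw std::runtime_error("token is younger than the queue age");
        if (same_age_count_ > 0 && t.age == last_age_) {
            if (same_age_count_ + 1 > concurrency_)
                throw std::runtime_error("token concurrency exceeded");
            ++same_age_count_;
        } else {
            // Assumed meaning: subsequent tokens lie at least latency_ cycles apart.
            if (same_age_count_ > 0 && t.age - last_age_ < latency_)
                throw std::runtime_error("token latency violated");
            last_age_ = t.age;
            same_age_count_ = 1;
        }
        tokens_.push_back(t);
    }

    std::size_t size() const { return tokens_.size(); }

private:
    int iteration_time_;
    int travel_delay_;
    int concurrency_;
    int latency_;
    int queue_age_ = 0;
    int last_age_ = 0;
    int same_age_count_ = 0;
    std::vector<AgedToken> tokens_;
};
```

A queue constructed with concurrency two accepts two tokens of the same age and throws on the third.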
Implementation


The current library of OCAPI allows a system to be
described in C++ by building on a set of basic classes:

• A simulation queue class that transports a token class
and allows expectation checks to be performed.

• An SFG/FSM class that allows clock cycle true
specification, simulation and code generation.

• A token class that allows simulation of both floating
point type representation and fixed point type
representation.

One can simulate the MATLAB data-vector data-type with C++
simulation queues. For the common MATLAB operations, one
can develop a library of SFG descriptions that reflect
different flavors of parallelism. For instance, a C++
version of the description

% input data
in = [1 2 1 3 3 4 1 2];
% spreading code
c = [1 -1 1 -1];
% correlate
ot = corr(in, c)
% find correlation peak
[max, maxpos] = max(ot);

looks, after scaling of the parallelism and defining the
actor boundaries, like

FB in, ot, maxp;

in.delay(1, 0);    // iteration time, travel delay
ot.delay(1, 0);
maxp.delay(4, 0);

in.expect(1, 1);   // travel time, concurrency, latency
ot.expect(1, 1);
maxp.expect(1, 4);

in = vector(1, 2, 1, 3, 3, 4, 1, 2);
ot = corr(8, 4, in, vector(1, -1, 1, -1));
maxp = maxpos(4, ot);

This C++ description contains all information necessary to
simulate the system in mind at clock cycle true level and
to generate the synthesizable code for the system and the
individual actors.

Thus, the data-flow level has become transparent: it is
not explicitly seen by the designer but rather it is
implied through the expectations (pragmas) and the
library.

Example 2: design of a 4-tap correlator processor

An example of processor design is given next to illustrate
hardware design when using OCAPI.

The task is to design a 4-tap correlator processor that
evaluates a correlation value every two cycles. One
coefficient of the correlation pattern needs to be
programmable and needs to be read in after a control signal
is asserted. The listing in figure 11 gives the complete
FSMD model of this processor.

The top of the listing shows how types are declared in
OCAPI. For example, the type T_sample is 8 bits wide and
has 6 bits beyond the binary point.

For such a type declaration, a signed, wrap-around and
truncating representation is assumed by default. This can
be easily changed, as for instance in

// floating point
dfix T_sample;

// unsigned
dfix T_sample(8, 6, ns);

// unsigned, rounding
dfix T_sample(8, 6, ns, rd);
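To make the effect of these options concrete, the following C++ sketch quantizes a double to a fixed point format. FixFormat and quantize are hypothetical helpers, not the dfix API. The sketch assumes that truncation means rounding towards minus infinity and that overflow wraps around modulo 2^width.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical description of a fixed point format.
struct FixFormat {
    int width;       // total number of bits
    int frac;        // bits beyond the binary point
    bool is_signed;  // two's complement when true
    bool round;      // round to nearest when true, truncate otherwise
};

// Returns the value that x takes after quantization to format f.
inline double quantize(double x, const FixFormat& f) {
    assert(f.width > 0 && f.width < 62 && f.frac >= 0);
    const double scale = std::ldexp(1.0, f.frac);  // 2^frac
    double scaled = x * scale;
    scaled = f.round ? std::floor(scaled + 0.5) : std::floor(scaled);
    const std::int64_t modulus = std::int64_t{1} << f.width;
    std::int64_t code = static_cast<std::int64_t>(scaled) % modulus;
    if (code < 0) code += modulus;                            // now in [0, 2^width)
    if (f.is_signed && code >= modulus / 2) code -= modulus;  // wrap into signed range
    return static_cast<double>(code) / scale;
}
```

For an 8-bit format with 6 fractional bits, the signed range is [-2, 2) in steps of 1/64, so the value 2.0 wraps around to -2.0, while the unsigned variant represents it exactly.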

Below the type declarations we see coefficient
declarations. These are specified as plain double types,
since they will be automatically quantized when read
into the coefficient registers. It is possible to intermix
existing C/C++ constructs and types with new ones.

Following the coefficients, the FSMD definition of the
correlator processor is shown. This definition requires
the specification of the instruction set that is processed
by this processor, and a specification of the control
behavior of the processor. For each of these, OCAPI uses
dedicated objects.

First, the instruction set is defined. Each instruction
performs data processing on signals, which must be defined
first. The definitions include plain signals (sample_in and
corr_out), registers (accu), and register arrays (coef[]
and sample[]).

Next, each of the instructions is defined. A definition is
started by creating an SFG object. All signal expressions
that come after such an SFG definition are considered to
make up part of it. An SFG definition is closed simply by
defining a new SFG object.

The first instruction, initialize_coefs, initializes the
coefficient registers coef[]. The for loop allows the
initialization to be expressed in a compact way. Thus, the
initialize_coefs instruction is also equivalent to

coef[0] = W(T_coef, hardwired_coef[0]);
coef[1] = W(T_coef, hardwired_coef[1]);
coef[2] = W(T_coef, hardwired_coef[2]);
coef[3] = W(T_coef, hardwired_coef[3]);



The second instruction programs the value of the first
coefficient. The new value, coef_in, is read from an input
port of the FSMD with the same name. Beyond this port, we
are 'outside' of the timed FSMD description and use
dataflow semantics, communicating via queues.

The third and fourth instructions, correl_1 and correl_2,
describe the two phases of the correlation. Complex
expressions are easily written just by using C++ operators.
Also, a cast operation is included that limits the
precision of the intermediate expression result. Although
this is of minor importance for simulation, it has a strong
influence on the hardware synthesis result.

The instruction read_sample shifts the data delay line. In
addition to a for loop, an if expression is used to express
the boundary value for the delay line. Use of simple C++
constructs such as these allows signal flow graph
structure to be expressed in a compact and elegant way. It
is especially useful in parametric design.
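A plain C++ analogue of such a delay-line shift, using a for loop with an if for the boundary element, might look as follows. This is not the listing of figure 11. The function names, the use of double and the shift direction are assumptions made for illustration, and the tap count N is the design parameter.

```cpp
#include <array>
#include <cstddef>

// Computes the next contents of an N-tap delay line from the current
// contents, mimicking registers whose next-values depend on current-values.
template <std::size_t N>
std::array<double, N> shift_delay_line(const std::array<double, N>& sample,
                                       double sample_in) {
    std::array<double, N> next{};
    for (std::size_t i = 0; i < N; ++i) {
        if (i == 0) {
            next[i] = sample_in;      // boundary: the new input enters the line
        } else {
            next[i] = sample[i - 1];  // all other taps take their neighbour
        }
    }
    return next;
}

// Correlates the delay line with a coefficient pattern of the same length.
template <std::size_t N>
double correlate(const std::array<double, N>& coef,
                 const std::array<double, N>& sample) {
    double accu = 0.0;
    for (std::size_t i = 0; i < N; ++i) {
        accu += coef[i] * sample[i];
    }
    return accu;
}
```

Changing N resizes both the delay line and the correlation without touching the loop bodies, which is the sense in which such constructs help parametric design.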

The last instruction, read_control, reads in the control
value that will decide whether the first correlation
coefficient needs to be refreshed.

Below all SFG definitions, the control behavior of the
correlator processor is described. An FSM with three states
is defined, using one initial state rst, and two normal
states phase_1 and phase_2. Next, four transitions are
defined between those three states. Each transition
specifies a start state, the transition condition, a set of
instructions to execute, and a target state. For a designer
used to finite state machine specification, this is a very
compact and efficient notation.

The transition condition always is true unconditionally,
while a transition condition like cnd(load) is true
whenever the register load contains a one.
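The transition-table style of specification can be sketched in plain C++ as below. The classes Transition and Fsm are hypothetical, and the four transitions in make_correlator_fsm are a plausible guess for illustration only; the actual transitions are those of the listing in figure 11.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// One transition: start state, condition, instructions to execute, target state.
struct Transition {
    std::string from;
    std::function<bool()> condition;
    std::vector<std::string> instructions;  // names of the SFGs to execute
    std::string to;
};

class Fsm {
public:
    explicit Fsm(std::string initial) : state_(std::move(initial)) {}

    void add(Transition t) { transitions_.push_back(std::move(t)); }

    // One clock cycle: take the first enabled transition leaving the
    // current state and return the instructions attached to it.
    std::vector<std::string> step() {
        for (const auto& t : transitions_) {
            if (t.from == state_ && t.condition()) {
                state_ = t.to;
                return t.instructions;
            }
        }
        return {};
    }

    const std::string& state() const { return state_; }

private:
    std::string state_;
    std::vector<Transition> transitions_;
};

// Assumed transition set; 'load' must outlive the returned Fsm.
inline Fsm make_correlator_fsm(const bool& load) {
    std::function<bool()> always = [] { return true; };
    std::function<bool()> load_set = [&load] { return load; };
    std::function<bool()> load_clear = [&load] { return !load; };
    Fsm f("rst");
    f.add({"rst", always, {"initialize_coefs"}, "phase_1"});
    f.add({"phase_1", load_set, {"program_coef", "correl_1"}, "phase_2"});
    f.add({"phase_1", load_clear, {"correl_1"}, "phase_2"});
    f.add({"phase_2", always, {"correl_2", "read_sample", "read_control"}, "phase_1"});
    return f;
}
```

Each call to step() models one clock cycle and returns the instruction names that would be marked for execution in that cycle.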

The resulting fsm description is returned to OCAPI by the
last return statement. The simulator and code generator can
now process the object hierarchy in order to perform
semantical checks, simulation, and code generation.

The translation to synthesizable VHDL and Cathedral-3 code
is automatic and needs no extra designer effort. The
resulting circuit for datapath and controller is shown in
figure 12. The hierarchy of the generated code that is
provided by OCAPI is also indicated. Each controller and
datapath are interlinked using a link cell. The link cell
itself can be embedded into an automatically generated
testbench or in the system link cell that
interconnects all components.

Example 3: design of Complex High Speed ASICs

The design of a 75 Kgate DECT transceiver is used as
another example (figure 13).

The design consists of a digital radiolink transceiver
ASIC, residing in a DECT base station (20) (figure 13). The
chip processes DECT burst signals, received through a radio
frequency front-end RF (21). The signals are equalized (22)
to remove the multipath distortions introduced in the radio
link. Next, they are passed to a wire-link driver DR (23),
which establishes communication with the base station
controller BSC (24). The system is also controlled locally
by means of a control component CTL (25).

The specifications that come with the design of the digital
transceiver ASIC in this system are as follows:

• The equalization involves complex signal processing, and
is described and verified inside a high level design
environment such as MATLAB.

• The interfacing towards the control component CTL and the
wire-link driver DR on the other hand is described as a
detailed clock-cycle true protocol.

• The allowed processing latency is, due to the real time
operation requirements, very low: a delay of only 29 DECT
symbols (25.2 µseconds) is allowed. The complexity of the
equalization algorithm, on the other hand, requires up to
152 data multiplies per DECT symbol to be performed. This
implies the use of parallel data processing, and
introduces a severe control problem.

• The scheduled design time to arrive from the
heterogeneous set of specifications at the verified gate
level netlist is 18 person-weeks.

The most important degree of freedom in this design process
is the target architecture, which must be chosen such that
the requirements are met. Due to the critical design time,
a maximum of control over the design process is required.
To achieve this, a programming approach to implementation
is used, in which the system is modelled in C++. The object
oriented features of this language allow high-level
descriptions of undesigned components to be mixed with
detailed clock-cycle true, bit-true descriptions. In
addition, appropriate object modelling allows the detailed
descriptions to be translated to synthesizable HDL
automatically. Finally, verification testbenches can be
generated automatically in correspondence with the C++
simulation.

The result of this design effort is a 75 Kgate chip with a
VLIW architecture, including 22 datapaths, each decoding
between 2 and 57 instructions, and including 7 RAM cells.
The chip has a 194 mm² die area in 0.7 µm CMOS technology.

The C++ programming environment allows results to be
obtained faster than existing approaches. Compared to
register transfer design environments, it will be shown
that C++ allows more compact, and consequently
less error prone, descriptions of hardware. High level
synthesis environments could solve this problem but have to
fix the target architecture beforehand. As will be
described in the case of the DECT transceiver design,
sudden changes in target architecture can occur due to hard
initial requirements that can be verified only at system
implementation.

First, the system machine model is introduced. This model
includes two types of description: high-level untimed ones
and detailed timed blocks. Using such a model, a simulation
mechanism is constructed. It will be shown that the
proposed approach outperforms current synthesis
environments in code size and simulation speed. Following
this, HDL code generation issues and hardware synthesis
strategies are described.

System Machine Model

Due to the high data processing parallelism, the DECT
transceiver is best described with a set of concurrent
processes. Each process translates to one component in the
final system implementation.

At the system level, processes execute using data flow
simulation semantics. That is, a process is described as an
iterative behavior, where inputs are read in at the start
of an iteration, and outputs are produced at the end.
Process execution can start as soon as the required input
values are available.

Inside each process, two types of description are
possible. The first one is a high level description, and
can be expressed using procedural C++ constructs. A firing
rule is also added to allow dataflow simulation.

The second flavour of processes is described at register
transfer level. These processes operate synchronously to
the system clock. One iteration of such a process
corresponds to one clock cycle of processing.

For system simulation, two schedulers are available. A
dataflow scheduler is used to simulate a system that
contains only untimed blocks. This scheduler repeatedly
checks process firing rules, selecting processes for
execution as their inputs become available.

When the system also contains timed blocks, a cycle
scheduler is used instead. The cycle scheduler
interleaves the execution of multi-cycle descriptions, but
can incorporate untimed blocks as well.
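The behavior of a dataflow scheduler of this kind can be sketched in a few lines of C++. The types Queue and Process and the function run_dataflow are hypothetical; the round-robin polling order and the firing cap are assumptions of this sketch rather than properties of the OCAPI scheduler.

```cpp
#include <deque>
#include <functional>
#include <vector>

using Queue = std::deque<int>;

// An untimed process: a firing rule plus one iteration of behavior.
struct Process {
    std::function<bool()> can_fire;  // true when the required inputs are present
    std::function<void()> fire;      // reads inputs, produces outputs
};

// Repeatedly checks the firing rules and fires every process whose rule
// holds. Stops when no process can fire any more or when max_firings is
// reached, and returns the number of firings performed.
inline int run_dataflow(std::vector<Process>& procs, int max_firings) {
    int firings = 0;
    bool progress = true;
    while (progress && firings < max_firings) {
        progress = false;
        for (auto& p : procs) {
            if (firings < max_firings && p.can_fire()) {
                p.fire();
                ++firings;
                progress = true;
            }
        }
    }
    return firings;
}
```

A source, a filter and a sink connected through two Queue objects can be scheduled this way by capturing the queues in the can_fire and fire lambdas.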


Figure 14 shows the front-end processing of the DECT
transceiver, and the difference between data-flow and cycle
scheduling. At the top, the front-end processing is seen.
The received signals are sampled by an A/D, and correlated
with a unique header pattern in the header correlator HCOR.
The resulting correlations are detected inside a header
detector block HDET. A simulation with high level
descriptions uses the dataflow scheduler. An example
dataflow schedule is seen in the middle of the figure. The
A/D high level description produces 3 tokens, which are put
onto the interconnect communication queue. Next, the
correlator high level description can be fired three times,
followed by the detector processing.

When, on the other hand, a cycle true description of the
A/D and header correlator is available, this system can
be simulated with the cycle scheduler as shown at the
bottom of the figure. This time, the behaviors of the A/D
block and correlator block are interleaved. As shown for
the HCOR block, executions can take multiple cycles to
perform. The remaining high level block, the detector,
contains a firing rule and is executed as required.
Relative to the global clock grid, it appears as a
combinatorial function.

Detailed process descriptions reflect the hardware behavior
of a component at the same level as the implementation. To
gain simulation performance and reduce coding effort,
several abstractions are made.

Finite wordlength effects are simulated with a C++ fixed
point library. It has been shown that the simulation of
these effects is easy in C++. Also, simulating the
quantization rather than the bitvector representation
allows significant simulation speedups.

The behavior is modelled with a mixed control/data
processing description, in the form of a finite state
machine coupled to a datapath. This model is common in the
synthesis community. In high throughput telecommunications
circuits such as the ones in the DECT transceiver ASIC, it
most often occurs that the desired component architecture
is known before the hardware description is made. The FSMD
model works well for this type of component.

The two aspects, wordlength modelling and cycle true
modelling, are available in the programming environment as
separate class hierarchies. Therefore, fixed point
modelling can be applied equally well to high level
descriptions.


As an illustration of cycle true modelling, a part of the
central VLIW controller description for the DECT
transceiver ASIC is shown in figure 15. The top shows a
Mealy type finite state machine (30). As actions, the
signal flowgraph descriptions (31) below it are executed.
The two states execute and hold correspond to operational
and idle states of the DECT system respectively. The
conditions are stored in registers inside the signal
flowgraphs. In this case, the condition holdrequest is
related to an external pin.

In execute state, instructions are distributed to the
datapaths. Instructions are retrieved out of a lookup
table, addressed by a program counter. When holdrequest is
asserted, the current instruction is delayed for execution,
and the program counter PC is stored in an internal
register. During a hold, a nop instruction is distributed
to the datapaths to freeze the datapath state. As soon as
holdrequest is removed, the stored program counter holdpc
addresses the lookup table, and the interrupted instruction
is issued to the datapaths for execution.
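The execute/hold behavior can be mimicked with a small C++ model. This is a simplified sketch and not the controller of figure 15: the class VliwController is hypothetical, instructions are represented as strings, the condition register delay is ignored, and the program is assumed to wrap around at its end.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class VliwController {
public:
    explicit VliwController(std::vector<std::string> program)
        : program_(std::move(program)) {
        assert(!program_.empty());
    }

    // Simulates one clock cycle and returns the instruction that is
    // distributed to the datapaths in this cycle.
    std::string clock(bool holdrequest) {
        if (holding_) {
            if (holdrequest) return "nop";  // still on hold: freeze the datapaths
            holding_ = false;
            pc_ = holdpc_;                  // resume at the interrupted instruction
        } else if (holdrequest) {
            holding_ = true;
            holdpc_ = pc_;                  // remember the delayed instruction
            return "nop";
        }
        std::string instruction = program_[pc_];  // lookup table access
        pc_ = (pc_ + 1) % program_.size();
        return instruction;
    }

private:
    std::vector<std::string> program_;
    std::size_t pc_ = 0;
    std::size_t holdpc_ = 0;
    bool holding_ = false;
};
```

Asserting holdrequest for two cycles after the first instruction yields two nop cycles, after which the second instruction is issued.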

Signals and Signal Flow Graphs

Signals are the information carriers used in construction
of a timed description. Signals are simulated using C++ sig
objects. These are either plain signals or else registered
signals. In the latter case the signals have a current
value and a next value, which are accessed at signal
reference and assignment respectively. Registered signals
are related to a clock object clk that controls signal
update. Both types of signals can be either floating point
values or else simulated fixed point values.

Using operations, signals are assembled into expressions.
By using the overloading mechanism as shown in figure 16,
the parser of the C++ compiler is reused to construct the
signal flowgraph data structure.

An example of this is shown in figure 17. The top of the
figure shows a C++ fragment (40). Executing this yields the
data structure (41) shown below it. It is seen that

• The signal flowgraph consists both of user defined nodes
and operation nodes. Operation nodes keep track of their
operands through pointers. The user defined signals are
atomic and have null operand pointers.

• The assignment operations use reversed pointers, allowing
the start of the expression tree that defines a signal to
be found.

A set of sig expressions can be assembled in a signal flow
graph (SFG). In addition, the desired inputs and outputs of
the signal flowgraph have to be indicated. This allows
semantical checks such as dangling input and dead code
detection, which warn the user of code inconsistency.

An SFG has well defined simulation semantics and represents
one clock cycle of behavior.
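The idea of reusing the C++ parser through operator overloading can be shown with a stripped-down sketch. The class sig below only borrows the name used in the text; it is a hypothetical miniature that records an expression tree and omits values, registers, clocks and assignment.

```cpp
#include <memory>
#include <string>
#include <utility>

// Node of the expression tree. User defined signals are atomic and have
// null operand pointers; operation nodes point to their operands.
struct Node {
    std::string op;    // "sig" for a user defined signal, else the operator
    std::string name;  // only used by user defined signals
    std::shared_ptr<Node> left;
    std::shared_ptr<Node> right;
};

class sig {
public:
    explicit sig(const std::string& name)
        : node_(std::make_shared<Node>(Node{"sig", name, nullptr, nullptr})) {}

    // Overloaded operators do not compute anything: they record the operation.
    friend sig operator+(const sig& a, const sig& b) { return sig(combine("+", a, b)); }
    friend sig operator*(const sig& a, const sig& b) { return sig(combine("*", a, b)); }

    // Prints the recorded tree in fully parenthesized form.
    std::string str() const { return print(node_); }

private:
    explicit sig(std::shared_ptr<Node> n) : node_(std::move(n)) {}

    static std::shared_ptr<Node> combine(const std::string& op, const sig& a, const sig& b) {
        return std::make_shared<Node>(Node{op, "", a.node_, b.node_});
    }

    static std::string print(const std::shared_ptr<Node>& n) {
        if (n->op == "sig") return n->name;
        return "(" + print(n->left) + " " + n->op + " " + print(n->right) + ")";
    }

    std::shared_ptr<Node> node_;
};
```

Writing a * b + c with three sig objects leaves operator precedence to the compiler, and the resulting object holds a tree that a simulator or code generator could traverse.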
Finite State Machines


After all instructions are described as SFG objects, the
control behavior of the component has to be described. We
use a Mealy-type FSM model to do this.

Again, the use of C++ objects allows very compact
and efficient descriptions. Figure 18 shows a graphical and
a C++ textual description of the same FSM. The
correspondence is obvious. To describe an equivalent FSM in
an event driven HDL, one usually has to follow the HDL
simulator semantics, and for example use multi-process
modelling. By using C++ on the other hand, the semantics
can be adapted depending on the type of object processed,
all within the same piece of source code.

Architectural Freedom

An important property of the combined control/data model is
the architectural freedom it offers. As an example, the
final system architecture of the DECT transceiver is shown
in figure 19. It consists of a central (VLIW) controller
(50), a program counter controller (51) and 22 datapath
blocks. Each of these is modelled with the combined
control/data processing shown above. They exchange data
signals that, depending on the particular block, are
interpreted as instructions, conditions or signal values.
By means of these interconnected FSMD machines, a more
complex machine is constructed.

The reason why this architectural freedom is necessary is
now explained. For the DECT transceiver, there is a severe
latency requirement. Originally, a dataflow target
architecture was chosen (figure 20), which is common for
this type of telecommunications signal processing. In such
an architecture, the individual components are controlled
locally and are data driven. For example, the header
detector processor signals a DECT header start (a
correlation maximum) as soon as it is sure that a global
maximum is reached.

Because of the latency requirement however, extra delay in
this component cannot be allowed, and it must signal the
first available correlation maximum as a valid DECT header.
In case a new and better maximum arrives, the header
detector block must then raise an exception to subsequent
blocks to indicate that processing should be restarted.
Such an exception has global impact. In a data driven
architecture however, such global exceptions are very
difficult to implement. This is far easier in a central
control architecture, where it takes the form of a jump
in the instruction ROM. Because of these difficulties, the
target architecture was changed from data driven to central
control. The FSMD machine model allowed the datapath
descriptions to be reused and only required the control
descriptions to be reworked. This architectural change was
done during the 18-week design cycle.


The Cycle Scheduler

Whenever a timed description is to be simulated, a cycle
scheduler is used instead of a dataflow scheduler. The
cycle scheduler creates the illusion of concurrency between
components on a clock cycle basis.

The operation of the cycle scheduler is best illustrated
with an example. In figure 21, the simulation of one cycle
in a system with three components is shown. The first two,
components 1 (60) and 2 (61), are timed descriptions
constructed using fsm and sfg objects. Component 3 (62) on
the other hand is described at high level using a firing
rule and a behavior. In the DECT transceiver, such a loop
of detailed (timed) and high level (untimed) components
occurs for instance in the RAM cells that are attached to
the datapaths. In that case, the RAM cells are described at
high level while the datapaths are described at clock cycle
true level.

The simulation of one clock cycle is done in three phases.
Traditional RT simulation uses only two: the first being an
evaluation phase, and the second being a register update
phase.

The three phases used by the cycle scheduler are a token
production phase, an evaluation phase and a register update
phase.

The three-phase simulation mechanism is needed to avoid
apparent deadlocks that might exist at the system level.
Indeed, in the example there is a circular dependency
between components 1, 2, and 3, and a dataflow scheduler
can no longer select which of the three components should
be executed first. In dataflow simulation, this is solved
by introducing initial tokens on the data dependencies.
Doing so would however require us to devise a buffer
implementation for the system interconnect, and introduce
an extra code generator in the system.

The cycle scheduler avoids this by creating the required
initial tokens in the token production phase. Each of the
phases operates as follows.

[0] At the start of each clock cycle, the sfg descriptions
to be executed in the current clock cycle are selected. In
each fsm description, a transition is selected, and the
sfgs related to this transition are marked for execution.

[1] Token production phase. For each marked sfg, look into
the dependency graph, and identify the outputs that
solely depend on registered signals and/or constant
signals. Evaluate these outputs and put the obtained
tokens onto the system interconnect.

[2] (a) Evaluation phase (case a). In the second phase,
schedule marked sfgs and untimed blocks for execution
until all marked sfgs have fired. Output tokens are
produced if they are directly dependent on input tokens
for timed sfg descriptions, or else if they are outputs
of untimed blocks.

[2] (b) Evaluation phase (case b). Outputs that depend
only on registered signals or constants are not produced
in the evaluation phase.

[3] Register update phase. For all registered signals in
marked sfgs, copy the next-value to the current-value.

The evaluation phase of the three-phase simulation is an
iterative process. If a pre-set number of iterations has
passed, and there are still unfired components, then the
system is declared to be deadlocked. This way, the cycle
scheduler identifies combinatorial loops in the system.
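The three phases can be summarized in a compact C++ sketch. The structures TimedSfg and UntimedBlock and the function simulate_cycle are hypothetical. Step [0], the selection of marked sfgs by the fsm descriptions, is assumed to have been done by the caller, and each phase is represented by a callback.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// A marked sfg of a timed component, split according to the three phases.
struct TimedSfg {
    std::function<void()> produce;   // [1] outputs that depend only on registers or constants
    std::function<bool()> ready;     // [2] are all input tokens available?
    std::function<void()> evaluate;  // [2] input-dependent outputs and register next-values
    std::function<void()> update;    // [3] copy next-value to current-value
};

// A high level block with a firing rule and a behavior.
struct UntimedBlock {
    std::function<bool()> can_fire;
    std::function<void()> fire;
};

// Simulates one clock cycle. Throws if marked sfgs remain unfired after
// max_iterations evaluation rounds, which indicates a deadlock.
inline void simulate_cycle(std::vector<TimedSfg>& marked,
                           std::vector<UntimedBlock>& untimed,
                           int max_iterations) {
    for (auto& s : marked) s.produce();  // [1] token production phase

    std::vector<bool> fired(marked.size(), false);  // [2] evaluation phase
    std::size_t remaining = marked.size();
    for (int it = 0; it < max_iterations && remaining > 0; ++it) {
        for (auto& u : untimed) {
            if (u.can_fire()) u.fire();
        }
        for (std::size_t i = 0; i < marked.size(); ++i) {
            if (!fired[i] && marked[i].ready()) {
                marked[i].evaluate();
                fired[i] = true;
                --remaining;
            }
        }
    }
    if (remaining > 0) {
        throw std::runtime_error("deadlock: combinatorial loop suspected");
    }

    for (auto& s : marked) s.update();  // [3] register update phase
}
```

In a loop of one timed component and one untimed block, the token produced in phase [1] is what lets the untimed block fire first, so no initial tokens have to be placed on the interconnect by hand.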

Code Generation and Simulation Strategy

The clock-cycle true, bit-true description of system
components serves a dual purpose. First, the descriptions
have to be simulated in order to validate them. Next, the
descriptions also have to be translated to an equivalent,
synthesizable HDL description.

In view of these requirements, the C++ description itself
can be treated in two ways in the programming environment.
In case of a compiled code approach, the C++ description is
translated to directly executable code. In case of an
interpreted approach, the C++ description is preprocessed
by the design system and stored as a data structure in
memory.

Both approaches have different advantages and uses. For
simulation, execution speed is of primary importance.
Therefore, compiled code simulation is needed. On the other
hand, HDL code generation requires the C++ description to
be available as a data structure that can be processed by a
code generator. Therefore, a code generator requires an
interpreted approach.

We solve this dual goal by using a strategy as shown in
figure 22. The clock-cycle true and bit-true description of
the system is compiled and executed. The description uses
C++ objects such as signals and finite state machine
descriptions which translate themselves to a control/data
flow data structure.

This data structure can next be interpreted by a simulator
for quick verification purposes. The same data structure is
also processed by a code generator to yield two different
descriptions.

A C++ description can be regenerated to yield an
application-specific and optimized compiled code simulator.
This simulator is used for extensive verification of the
design because of the efficient simulation runtimes.
A synthesizable HDL description can also be generated to
arrive at a gate-level implementation.

The simulation performance difference between these three
formats (interpreted C++ objects, compiled C++, and HDL) is
illustrated in table 1. Simulation results are shown for
the DECT header correlator processor, and also the complete
DECT transceiver ASIC.

The C++ modelling gains a factor of 5 in code size (for the
interpreted-object approach) over RT-VHDL modeling. This is
an important advantage given the short design cycle for the
system. Compiled code C++ on the other hand provides faster
simulation and smaller process size than RT-VHDL.

For reference, results of netlist-level VHDL and Verilog
simulations are given.



Design  Size     Type                   Source Code  Simulation Speed  Process Size
        (Gates)                         (# lines)    (cycles/s)        (Mb)

HCOR    6K       C++ (interpreted obj)  230          69                3.8
                 C++ (compiled)         1700         819               2.7
                 VHDL (RT)              1600         251               11.9
                 VHDL (Netlist)         77000        2.7               81.5

DECT    75K      C++ (interpreted obj)  8000         2.9               20
                 C++ (compiled)         26000        60                5.1
                 Verilog (Netlist)      59000        18.3              100

Table 1.



Synthesis Strategy

Finally, the synthesis approach that was used for the DECT
transceiver is documented. As shown in figure 1D, the
clock-cycle true, bit-true C++ description can be
translated from within the programming environment into
equivalent HDL.

For each component, a controller description and a datapath
description are generated, in correspondence with the C++
description. This is done because we rely on separate
synthesis tools for both parts, each one optimized towards
either controller or datapath synthesis tasks.

For datapath synthesis, we rely on the Cathedral-3 back-end
datapath synthesis tools, which produce a bit-parallel
hardware implementation starting from a set of
signal flowgraphs. These tools allow operator sharing at
word level, and result in run times of less than 15 minutes
even for the most complex, 57-instruction datapath of the
DECT transceiver.

Controller synthesis on the other hand is done by a logic
synthesis tool such as Synopsys DC. For pure logic
synthesis such as FSM synthesis, this tool produces
efficient results. The combined netlists of datapath and
controller are also post-optimized by Synopsys DC to
perform gate-level netlist optimizations. This divide and
conquer strategy towards synthesis allows each tool to be
applied at the right place.

During system simulation, the system stimuli are also
translated into testbenches that allow the synthesis result
of each component to be verified. After interconnecting
all synthesized components into the system netlist, the
final implementation can also be verified using a generated
system testbench.

Example 4: design of a QAM transmission system with OCAPI
(figure 23)

A QAM transmission system that includes a transmitter, a
channel model, and a receiver is designed.

System Specification

A system specification in OCAPI is an executable model: an
executable file that can be run as a software program on a
computer. The principle of executable specification, as it
is called, is very important for system design. It allows
the specification to be checked using simulations. In this
case, we are designing a QAM transmission system. A full
communications system contains a transmitter, a channel
model, and a receiver. The ensemble of the transmitter,
channel model and receiver organized as an executable
specification is also called an end-to-end executable
specification. The term end-to-end indicates that
the simulation starts with a user message, and ends with a
(received) user message. In between, the complete digital
transmission is modeled, as shown in figure 23.

In this text, the complete transmission system will be
developed. The development of a component in such a system
is never a one-shot process. Rather, development proceeds
through a design flow: a collection of subsequent design
levels connected by 'natural' design tasks. For a modem,
the typical design levels are:

- a statistical level, to do high level explorations of
algorithms. In OCAPI, this level is called the link
level.

- a functional level, to assemble selected algorithms into
a single operational modem. In OCAPI, this level is
called the algorithm level.

- a structural level, to represent the modem as a machine
that executes a functional description. In OCAPI, this
level is called the architecture level.

Each of these levels has its own set of requirements.
Statistical requirements can be for example a bit error
rate or a cell loss ratio. Functional requirements are for
instance the set of modulation schemes to support.
Finally, structural requirements are requirements like
type of interfaces, or preselected architectures.

Arranging the requirements besides the design levels yields 
the design flow, as shown in figure IB. The dashed box 
contains the levels that will be coded in C+ + -OCAPI. The 
upper level (the statistical one) is described in a 
language like Matlab. It is not included in this text: We 
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will start from a complete functional specification. The 
functional specification is. given herebelow in part A. 



Design Flow in OCAPJ-C++ 

Overall Design Flow 

A design flow with OCAPI looks, from a high level point
of view, as shown in figure 1C. The initial
specification is an algorithmic model, constructed in
C++. Through the use of refinement, we will construct
an architecture model out of it. Next, relying on code
generation, we obtain a synthesizable architecture
model. This model can be converted to a
technology-mapped architecture in terms of gates. OCAPI is
concerned with the C++ layers of this flow, and in
addition takes care of code generation issues.

Algorithmic Models 

The algorithmic models in OCAPI use the dataflow
computational model. The construction of this code is
discussed by means of small examples selected out of
Part B (below).

First, we consider the construction of an actor. An
actor is a subalgorithm out of a dataflow system model.
In OCAPI, each actor is defined by one class. As an
example of actor definition, we take the diffenc block
out of the transmitter. The include file (3.3) defines
a class diffenc (line 10) that inherits from a base
class. This inheritance defines the class under
definition as a dataflow actor. The dataflow actor
defines a constructor, a run method and a reset method.
The run method (line 25) is the method that is called
when the actor should be executed. The constructor takes
along parameters that include the name (name), the I/O
ports (_symb1, _symb2) and other attributes (_qpsk,
_diff_mode). The type FB (Flow-Buffer) is the type of a
FIFO queue. Looking at the implementation of run (??,
line 26), we distinguish a firing rule in lines 29-30.
The getSize() method of a queue returns the number of
elements in that queue. The firing rule expresses that
the run() method should return whenever there is no
data available in the queue. Otherwise, processing
continues as described beyond line 32. (This processing
is the implementation of the spec as described in Part
A.)
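The actor pattern described above (a class with a run method guarded by a firing rule) can be sketched in a few lines. This is a minimal illustration only: the Fifo and Doubler classes are hypothetical stand-ins, not OCAPI's FB and base classes, and only the getSize/get/put queue interface is taken from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for a flow buffer: a plain FIFO of ints.
class Fifo {
    std::deque<int> q_;
public:
    std::size_t getSize() const { return q_.size(); }
    void put(int v) { q_.push_back(v); }
    int get() {
        assert(!q_.empty());
        int v = q_.front();
        q_.pop_front();
        return v;
    }
};

// Hypothetical actor: doubles every token it consumes.
class Doubler {
    Fifo &in_;
    Fifo &out_;
public:
    Doubler(Fifo &in, Fifo &out) : in_(in), out_(out) {}

    // Returns 1 if the actor fired, 0 if the firing rule was not met.
    int run() {
        // firing rule: one input token is required
        if (in_.getSize() < 1) {
            return 0;
        }
        // core functionality
        out_.put(2 * in_.get());
        return 1;
    }
};
```

Calling run() on an empty input queue returns 0 without side effects, which is what allows a scheduler to poll actors blindly.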

A dataflow system is constructed out of such actors.
The system code in 5.3 shows how the diffenc actor is
instantiated (lines 57-61). Besides actors, the system
code also creates interconnect queues (lines 42-48). By
giving these as parameters in the constructor of
actors, the required communication links are
established. Besides the interconnection of actors,
the system code also needs to create a scheduler. This
scheduler will repeatedly test firing rules in the
actors (by calling their run() method). The system
scheduler that steers the differential encoder is shown
on line 77 of 5.3. After this object is created, all
dataflow actors that should be under control of it are
"shifted into" it. The scheduler object has a method,
run(), that tries firing all of the actors associated
with the schedule just once.
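The interplay of queues, actors and a scheduler described above can be sketched as follows. The Queue, Actor, MapActor and Scheduler names are hypothetical and chosen for this illustration; the use of operator<< to "shift" actors into the scheduler is an assumption modeled on the wording of the text, not a statement about the OCAPI API.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical FIFO of ints standing in for an interconnect queue.
struct Queue {
    std::deque<int> data;
    std::size_t getSize() const { return data.size(); }
    void put(int v) { data.push_back(v); }
    int get() {
        assert(!data.empty());
        int v = data.front();
        data.pop_front();
        return v;
    }
};

// Hypothetical actor base class: run() returns 1 when the actor fired.
struct Actor {
    virtual ~Actor() = default;
    virtual int run() = 0;
};

// Actor that applies a function to each token it consumes.
class MapActor : public Actor {
    Queue &in_;
    Queue &out_;
    std::function<int(int)> f_;
public:
    MapActor(Queue &in, Queue &out, std::function<int(int)> f)
        : in_(in), out_(out), f_(std::move(f)) {}
    int run() override {
        if (in_.getSize() < 1) return 0;   // firing rule
        out_.put(f_(in_.get()));
        return 1;
    }
};

// Scheduler: actors are shifted into it; run() tries each actor once.
class Scheduler {
    std::vector<Actor *> actors_;
public:
    Scheduler &operator<<(Actor &a) {
        actors_.push_back(&a);
        return *this;
    }
    // Returns the number of actors that fired during this pass.
    int run() {
        int fired = 0;
        for (Actor *a : actors_) fired += a->run();
        return fired;
    }
};
```

The communication links exist only because two actors were constructed with the same queue object; the scheduler itself knows nothing about the topology.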

Architecture Models

An architecture model expresses the behavior of the
algorithmic model in terms of operations onto hardware.
The kind of hardware features that affect this depend
of course on the target architectural style. OCAPI is
intended for a bit-parallel, synchronous style. For
this kind of style, two kinds of refinements are
necessary: First, the data types need to be expressed
in terms of fixed point numbers. Second, the execution
needs to proceed in terms of clock cycles. The first
kind of refinement is called fixed point modeling. The
second kind is called cycle true modeling. These two
refinements can be done in any order; for a complete
architecture model, both are needed.

We first give an example of how fixed point numbers are
expressed in C++. Consider the ad block of the transmitter
(3.2, line 24-27). The purpose of this block is to introduce
a quantization effect, such as for instance would be
encountered when the signal passes through an
analog-digital or digital-analog converter. In this case, the
high level algorithmic model is constructed with a
fixed point number in order to perform this
quantization. On line 32, an object of type dfix
(called indfix) is created. This object represents a
fixed point value. The constructor uses three
parameters. The first, '0', provides an initial value.
The following two (W and L) are parameters that
represent the wordlength and fractional wordlength
respectively. The operation of the ad block is as
follows. When there is information in the input queue,
the value read is assigned to the fixed point number
indfix. At the moment of assignment, quantization
happens, whether or not the input value was a floating
point value. (The FIFO buffers are actually passing
along objects of type dfix, so that floating as well as
fixed point numbers can be passed from one block to the
other.)

A next example will show how cycle true
modeling is done. We consider the derandomizer function
of the receiver (6.4). First, looking at the
algorithmic model (line 69-102), we see that the block
reads two inputs (byte_in and syncro) and writes one
output (byte_out). In between, it performs some
algorithmic processing (line 89-97). The architecture
model is shown in the define() function starting at
line 116. The first few lines are type definitions and
signal declarations. Next, four instructions are
defined (line 143-179), and a controller which
sequences these instructions is specified (line 184-195).
The architecture model makes heavy use of
macros to ease the job of writing code. All of these
are explained above. The goal of the define() function
is to define an object hierarchy consisting of signals,
expressions, states, etc. that represents the cycle
true behavior of a processor. At the top of the
hierarchy is a finite state machine object. The member
function fsm() (line 106) returns this object (which is
a data member of the derandomizer class). The system
integration of the derandomizer (5.3, line 169-176) is
the same for the algorithmic and architecture model.
The selection between algorithmic and architecture
model is done by giving a system scheduler either a
base object (as in line 186) or else the fsm object for
simulation (as in line 206). Remember that the
algorithmic model creates a class that derives
from the base object, while an architecture model
defines a finite state machine object.
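The quantization-on-assignment behavior described for the ad block can be illustrated with a small stand-alone class. This is a sketch under stated assumptions and not OCAPI's dfix: the class name Fixed, the two's complement interpretation of the W-bit word, rounding by floor and saturation on overflow are all choices made here for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical fixed point value with W total bits (two's complement)
// and L fractional bits. Floor rounding and saturation are assumptions.
class Fixed {
    int w_;
    int l_;
    double value_;
public:
    Fixed(double init, int w, int l) : w_(w), l_(l), value_(0.0) {
        assert(w >= 1 && l >= 0);
        *this = init;
    }

    // Quantization happens here, at the moment of assignment.
    Fixed &operator=(double x) {
        const double scale = std::ldexp(1.0, l_);              // 2^L
        const double maxCode = std::ldexp(1.0, w_ - 1) - 1.0;  // largest code
        const double minCode = -std::ldexp(1.0, w_ - 1);       // smallest code
        double code = std::floor(x * scale);
        if (code > maxCode) code = maxCode;
        if (code < minCode) code = minCode;
        value_ = code / scale;
        return *this;
    }

    double value() const { return value_; }
};
```

With W = 8 and L = 4 the resolution is 1/16, so assigning 1.30 stores 1.25, and assigning a value beyond the representable range stores the largest code, 7.9375.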

Code Generation 

Finally we indicate the output of the code generation
process. When an architecture model is constructed,
several code generators can be used. OCAPI currently
can generate RT-VHDL code directly, or else
Cathedral-3 dsfg code. When the member function
generate() of a system scheduler is called, Cathedral-3
code will be produced, along with the required system
link cells. The member function vhdlook() on the other
hand produces RT-VHDL code. In this example, we have
used the vhdlook() method (5.2, line 401). We consider
the derandomizer block in the receiver. The first place
where this appears in the generated code is in the
system netlist (6.13, line 70 and line 143). Next, we
can find the definitions of the block itself: its
entity declaration (6.14), the RTL code (6.15), and a
mapping cell from the fixed-point VHDL type FX to the
more common VHDL type std_logic (6.16). By using this
last mapping cell, we can also hook up the VHDL code
for derand in a generated testbench (6.17). This
testbench driver reads stimuli recorded during the C++
simulation and feeds them into the VHDL simulation.

Part A: System Specification 
System Contents 



The end-to-end model of the QAM transmission system
under consideration is shown in figure 23. It consists
of four main components:

- A byte generator GEN (201).

- A transmitter TX (203).

- A channel model CHAN (205).

- A receiver RX (207).

The byte generator generates a sequence of random bytes.
These are modulated inside of the transmitter to a QAM
signal. The channel model next introduces distortions in
the signal, similar to those occurring in a real channel.
Finally, the receiver demodulates the signal, returning a
decoded byte sequence. If no bit errors occur, then this
sequence should be the same as the one created by the byte
generator.

Next, the detailed operation of the transmitter, the
channel and the receiver is discussed. For the internal
construction of a component, refer to figure 24.
Transmitter Specification

The Transmitter includes

- rnd: A randomizer, which transforms a byte sequence into
a pseudorandom byte sequence. This is done because of
the more regular spectral properties of a randomized
(or 'whitened') byte sequence.

- tuple: A tuplelizer, which chops the transmitted bytes
into QAM/QPSK symbols.

- diffenc: A differential encoder, which applies
differential encoding to the symbols.

- map: A QAM symbol mapper, which translates QAM symbols
to I/Q pulse sequences.

- shape: A pulse shaper, which transforms the pulse
sequences to a continuous wave. In a digital
implementation, the temporal 'continuity' is achieved by
applying oversampling.

- da: Finally, there is a block which applies quantization
to the signal. This block simulates the effect of a
digital-to-analog converter.



The transmitter reads in a byte sequence, and randomizes
this with a pseudorandom byte sequence. The sequence
contains a synchronization word to align the receiver
derandomizer to the transmitter randomizer. The
pseudorandom sequence is generated by exoring a bitstream
with a bitstream produced by a linear feedback shift
register (LFSR). The LFSR produces a bitstream according
to the polynomial g(x) = 1 + x^5 + x^6. It next feeds the
bytes to a tuplelizer that generates symbols out of the
byte sequence according to the following scheme.
Given bits b7 b6 b5 b4 b3 b2 b1 b0,

Bit position   QPSK          QAM16
b7             I symbol 0    I[1] symbol 0
b6             Q symbol 0    I[0] symbol 0
b5             I symbol 1    Q[1] symbol 0
b4             Q symbol 1    Q[0] symbol 0
b3             I symbol 2    I[1] symbol 1
b2             Q symbol 2    I[0] symbol 1
b1             I symbol 3    Q[1] symbol 1
b0             Q symbol 3    Q[0] symbol 1
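The chopping of a byte into symbols can be sketched as a free function. The function name tuplelize and its signature are chosen for this illustration; the shift-and-mask logic follows the tuplelize actor of Part B (two bits per symbol for QPSK, four for QAM16, most significant bits first).

```cpp
#include <cassert>
#include <vector>

// Splits one byte into symbols, most significant bits first.
// QPSK uses 2 bits per symbol, QAM16 uses 4 bits per symbol.
std::vector<int> tuplelize(int byte, bool qpsk) {
    assert(byte >= 0 && byte <= 0xff);
    const int us = qpsk ? 2 : 4;          // bits per symbol
    const int msk = qpsk ? 0x03 : 0x0F;   // symbol mask
    std::vector<int> symbols;
    int tuple = byte;
    for (int k = 0; k < 8 / us; ++k) {
        symbols.push_back((tuple >> (8 - us)) & msk);
        tuple = (tuple << us) & 0xff;
    }
    return symbols;
}
```

For the byte 0xB4 (binary 1011 0100) this yields the four QPSK symbols 2, 3, 1, 0, or the two QAM16 symbols 0xB and 0x4.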
The symbol values are next fed to the differential encoder
that generates a differentially encoded symbol sequence:

i = (((~(a ^ b)) & (a ^ glbIstate)) | ((a ^ b) & (b ^ glbQstate))) & 1;
q = (((~(a ^ b)) & (b ^ glbQstate)) | ((a ^ b) & (a ^ glbIstate))) & 1;

with i and q the output msbs of the differentially encoded
symbol; glbIstate, glbQstate the previous values of i and
q; and a and b the input msbs of the input symbol. The
lsbs are left untouched (only for QAM16). The differentially
encoded symbol sequence is next mapped to the actual symbol
value using the following constellation for QPSK.



QVal/Ival    -3    +3
+3            2     0
-3            3     1

For QAM16, the following constellation will be used

QVal/Ival    -3    -1    +1    +3
+3           11     9     2     3
+1           10     8     0     1
-1           14    12     4     6
-3           15    13     5     7
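The two constellations above amount to a table lookup from symbol value to an (I, Q) pair. The sketch below uses a hypothetical function mapSymbol; the table contents follow the constellations above and the lookup tables of tx/map.cxx in Part B, without the leading dummy entry used there.

```cpp
#include <cassert>
#include <utility>

// Returns the (I, Q) constellation point for a symbol value.
std::pair<double, double> mapSymbol(int symbol, bool qpsk) {
    static const double i4[] = {+3, +3, -3, -3};
    static const double q4[] = {+3, -3, +3, -3};
    static const double i16[] = {+1, +3, +1, +3, +1, +1, +3, +3,
                                 -1, -1, -3, -3, -1, -1, -3, -3};
    static const double q16[] = {+1, +1, +3, +3, -1, -3, -1, -3,
                                 +1, +3, +1, +3, -1, -3, -1, -3};
    assert(symbol >= 0 && symbol < (qpsk ? 4 : 16));
    if (qpsk) {
        return {i4[symbol], q4[symbol]};
    }
    return {i16[symbol], q16[symbol]};
}
```

For example, QAM16 symbol 11 sits at I = -3, Q = +3, the top-left corner of the QAM16 table.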



After mapping, the resulting complex sequence is pulse
shaped. An RRC shaping filter with oversampling n = 4 is
taken, with the rolloff factor set at r = 0.3. After pulse
shaping, the sequence is upconverted to fc = fs/4 in the
multiplexer block (included in the shaper).
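Pulse shaping with oversampling can be pictured as zero insertion followed by an FIR filter. The sketch below shows that structure for one real-valued rail only; the function name pulseShape is hypothetical, the filter taps are supplied by the caller rather than computed as an RRC response, and the upconversion to fs/4 is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Upsamples by sps through zero insertion and filters with taps c.
std::vector<double> pulseShape(const std::vector<double> &symbols,
                               const std::vector<double> &c,
                               std::size_t sps) {
    assert(sps >= 1);
    std::vector<double> out;
    if (c.empty()) return out;
    std::vector<double> delay(c.size(), 0.0);   // delay line, newest first
    for (double s : symbols) {
        for (std::size_t j = 0; j < sps; ++j) {
            // shift the delay line by one sample
            for (std::size_t i = delay.size() - 1; i >= 1; --i) {
                delay[i] = delay[i - 1];
            }
            // the symbol enters once, followed by sps - 1 zeros
            delay[0] = (j == 0) ? s : 0.0;
            double acc = 0.0;
            for (std::size_t i = 0; i < c.size(); ++i) {
                acc += delay[i] * c[i];
            }
            out.push_back(acc);
        }
    }
    return out;
}
```

Each input symbol produces sps output samples, so the output rate is sps times the symbol rate.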

Channel Model Specification

The Channel Model contains

- FIR filter with programmable taps. The filter is used
to simulate linear distortions such as multipath
effects.

- Noise injection block.

The incoming signal is fed into a
20 tap filter. The second, third, fourth and 21st tap of
the filter are programmable. Next a noise signal is
added to the sequence. The noise distribution is
Gaussian:

X1 = sqrt(-2*ln(U1)) * cos(2*pi*U2)

X2 = sqrt(-2*ln(U1)) * sin(2*pi*U2)

U1, U2 are independent and uniform [0,1],
X1 and X2 are independent and N(0,1)
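The two formulas above are the Box-Muller transform. A minimal sketch, with the hypothetical function name boxMuller, takes the two uniform samples as arguments so that the source of randomness stays outside the function; u1 must be strictly positive because of the logarithm.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Box-Muller transform: maps two independent uniform samples to two
// independent samples of a standard normal distribution.
std::pair<double, double> boxMuller(double u1, double u2) {
    assert(u1 > 0.0 && u1 <= 1.0);
    const double pi = std::acos(-1.0);
    const double r = std::sqrt(-2.0 * std::log(u1));
    return {r * std::cos(2.0 * pi * u2), r * std::sin(2.0 * pi * u2)};
}
```

To add noise of a given amplitude n to a sample x, one of the two outputs is scaled and added: x + n * X1.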



Receiver Specification

The Receiver includes

- lmsff: A feed forward, T/4 spaced LMS equalizer.

- demap: A demapper, translating a complex signal
back to a QAM symbol.

- detuple: A detupler, glueing individual symbols back
to bytes.

- derand: A derandomizer, translating the pseudonoise
sequence back to an unrandomized sequence.



It is not difficult to see that this signal processing
corresponds to the reverse processing that was applied at
the transmitter. The incoming signal is fed into an
equalizer block. The 4 tap oversampled FF equalizer is
initialized with a downconverting RRC sequence. This way,
the equalizer will act at the same time as a matched
filter, a symbol timing recovery loop, a phase recovery
loop, and an intersymbol-interference removing device. It
is a simple solution to the physical synchronization
problem in QAM.

The equalizer is initialized as follows. Given the complex
RRC

        tap0    tap1    tap2    tap3
I       i0      i1      i2      i3
Q       q0      q1      q2      q3

then the LMS should be initialized with

        tap0    tap1    tap2    tap3
I       i0      0       -i2     0
Q       0       q1      0       -q3
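The initialization pattern of the second table can be written down directly. The sketch below is only a transcription of that pattern for four taps; the ComplexTaps type and the function name initialLmsTaps are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical container for the I and Q parts of a set of taps.
struct ComplexTaps {
    std::vector<double> i;
    std::vector<double> q;
};

// Builds the initial LMS taps from the complex RRC taps following the
// pattern (i0, 0, -i2, 0) for I and (0, q1, 0, -q3) for Q.
ComplexTaps initialLmsTaps(const ComplexTaps &rrc) {
    assert(rrc.i.size() == 4 && rrc.q.size() == 4);
    ComplexTaps t;
    t.i = {rrc.i[0], 0.0, -rrc.i[2], 0.0};
    t.q = {0.0, rrc.q[1], 0.0, -rrc.q[3]};
    return t;
}
```

The alternating zeros and sign changes are what one would expect from sampling a carrier at a quarter of the sample rate, which fits the description of a downconverting RRC sequence.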

The coefficient adaptation algorithm of the equalizer is of
the Least Mean Square type. This algorithm is decision
directed; such algorithms are able to do tracking in a
synchronization loop, but not to do acquisition
(initialization) of the same loops. For simplicity in this
example, we will however abstract from this
acquisition problem. Next, the inverse operations of the
transmitter are performed: the demodulated complex signal
is converted to a QAM symbol in the demapper. The resulting
QAM symbol stream is differentially decoded and assembled
to a byte sequence in the detupler. The differential
decoding proceeds according to

a = (((~(i ^ q)) & (i ^ glbIstate)) | ((i ^ q) & (q ^ glbQstate))) & 1;
b = (((~(i ^ q)) & (q ^ glbQstate)) | ((i ^ q) & (i ^ glbIstate))) & 1;

Finally, the pseudorandom encoding of the sequence is
removed in the derandomizer.
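That the differential decoder undoes the differential encoder can be checked with a short stand-alone sketch. Two assumptions are made here: the decoder state holds the previously received (i, q) pair, and both sides start from state (0, 0). The type and function names are hypothetical.

```cpp
#include <cassert>

// State shared in form by encoder and decoder: the previous (i, q) pair.
struct DiffState {
    int i = 0;
    int q = 0;
};

struct Bits {
    int first;
    int second;
};

// Encoder: maps input msbs (a, b) to (i, q), then stores (i, q).
Bits diffEncode(int a, int b, DiffState &s) {
    assert(a >= 0 && a <= 1 && b >= 0 && b <= 1);
    int i = (((~(a ^ b)) & (a ^ s.i)) | ((a ^ b) & (b ^ s.q))) & 1;
    int q = (((~(a ^ b)) & (b ^ s.q)) | ((a ^ b) & (a ^ s.i))) & 1;
    s.i = i;
    s.q = q;
    return {i, q};
}

// Decoder: recovers (a, b) from the received (i, q) and the previously
// received pair, then stores the received pair as the new state.
Bits diffDecode(int i, int q, DiffState &s) {
    assert(i >= 0 && i <= 1 && q >= 0 && q <= 1);
    int a = (((~(i ^ q)) & (i ^ s.i)) | ((i ^ q) & (q ^ s.q))) & 1;
    int b = (((~(i ^ q)) & (q ^ s.q)) | ((i ^ q) & (i ^ s.i))) & 1;
    s.i = i;
    s.q = q;
    return {a, b};
}
```

Feeding every encoder output straight into the decoder returns the original (a, b) pair for any input sequence, since both sides update their state with the same transmitted pair.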

Part B: C++ code of the QAM system 

3 Transmitter Code 
3.1 tx/ad.h 

  1 // ad.h
  2 // All rights reserved -- Imec 1998
  3 // @(#)ad.h 1.2 03/20/98
  4
  5 #ifndef AD_H
  6 #define AD_H
  7
  8 #include "qlib.h"
  9
 10 class ad : public base {
 11   FB *in;
 12   FB *ot;
 13   double *W;
 14   double *L;
 15
 16 public:
 17   ad(char *name, FB & _in, FB & _ot, double & _W, double & _L);
 18   int run();
 19   int reset();
 20 };
 21
 22 #endif

3.2 tx/ad.cxx

  1 // ad.cxx
  2 // All rights reserved -- Imec 1998
  3 // @(#)ad.cxx 1.4 03/31/98
  4
  5 #include "ad.h"
  6
  7 ad::ad(char *name,
  8        FB & _in,
  9        FB & _ot,
 10        double & _W,
 11        double & _L) : base(name)
 12 {
 13   in = _in.asSource(this);
 14   ot = _ot.asSink(this);
 15   W = &_W;
 16   L = &_L;
 17 }
 18
 19 int ad::reset() {
 20   // return to initial state
 21   return 1;
 22 }
 23
 24 int ad::run() {
 25
 26   // firing rule
 27   if (in->getSize() < 1) {
 28     return 0;
 29   }
 30
 31   // core functionality
 32   dfix indfix(0, (int)(*W), (int)(*L));
 33   indfix = in->get();   // inputting, quantization assignment
 34   ot->put(indfix);      // outputting
 35
 36   return 1;
 37 }
 38

3.3 tx/diffenc.h 

  1 // diffenc.h
  2 // All rights reserved -- Imec 1998
  3 // @(#)diffenc.h 1.7 98/03/31
  4
  5 #ifndef DIFFENC_H
  6 #define DIFFENC_H
  7
  8 #include "qlib.h"
  9
 10 class diffenc : public base {
 11
 12   FB *symb1;
 13   FB *symb2;
 14   double *qpsk;
 15   double *diff_mode;
 16   int iState;
 17   int qState;
 18
 19 public:
 20   diffenc(char *name,
 21           FB & _symb1,
 22           FB & _symb2,
 23           double & _qpsk,
 24           double & _diff_mode);
 25   int run();
 26   int reset();
 27 };
 28
 29 #endif

3.4 tx/diffenc.cxx

  1 // diffenc.cxx
  2 // All rights reserved -- Imec 1998
  3 // @(#)diffenc.cxx 1.8 98/03/31
  4
  5 #include "diffenc.h"
  6
  7 diffenc::diffenc(char *name,
  8                  FB & _symb1,
  9                  FB & _symb2,
 10                  double & _qpsk,
 11                  double & _diff_mode) : base(name)
 12 {
 13   symb1 = _symb1.asSource(this);
 14   symb2 = _symb2.asSink(this);
 15   qpsk = &_qpsk;
 16   diff_mode = &_diff_mode;
 17   reset();
 18 }
 19
 20 int diffenc::reset() {
 21   iState = 0;
 22   qState = 0;
 23   return 1;
 24 }
 25
 26 int diffenc::run() {
 27
 28   // firing rule
 29   if (symb1->getSize() < 1)
 30     return 0;
 31
 32   // core func
 33   int symb = (int) Val(symb1->get());
 34
 35   if ((int) *diff_mode) {
 36     int a = ((int) *qpsk) ? (symb >> 1) & 1 : (symb >> 3) & 1;  // get msb's only
 37     int b = ((int) *qpsk) ? (symb >> 0) & 1 : (symb >> 2) & 1;
 38
 39     int i = (((~(a^b)) & (a^iState)) | ((a^b) & (b^qState))) & 1;  // encode msb
 40     int q = (((~(a^b)) & (b^qState)) | ((a^b) & (a^iState))) & 1;
 41
 42     iState = i;
 43     qState = q;
 44
 45     symb = ((int) *qpsk) ? (i << 1) | q : (i << 3) | (q << 2) | (symb & 3);
 46   }
 47
 48   symb2->put(symb);
 49   return 1;
 50 }
 51

3.5 tx/map.h 



  1 //
  2 // COPYRIGHT
  3 // =========
  4 //
  5 // Copyright 1996 IMEC, Leuven, Belgium
  6 //
  7 // All rights reserved.
  8 //
  9 //
 10 // Module:
 11 //   MAP
 12 //
 13 // Purpose:
 14 //   Mapping of QAM16 constellations to symbols and back
 15 //
 16 // Author:
 17 //   Patrick Schaumont
 18 //
 19
 20 #ifndef MAP_H
 21 #define MAP_H
 22
 23 #include "qlib.h"
 24
 25 class map : public base {
 26   double *qpsk;
 27
 28   FB *sIn;
 29   FB *qOut;
 30   FB *iOut;
 31
 32   dfix immediateQ(dfix v);
 33   dfix immediateI(dfix v);
 34
 35 public:
 36   map(char *name, FB & _sIn, FB & _iOut, FB & _qOut, double & _qpsk);
 37   int run();
 38
 39 };
 40
 41 #endif

3.6 tx/map.cxx

  1 //
  2 // COPYRIGHT
  3 // =========
  4 //
  5 // Copyright 1996 IMEC, Leuven, Belgium
  6 //
  7 // All rights reserved.
  8 //
  9 //
 10 // Module:
 11 //   MAP
 12 //
 13 // Purpose:
 14 //   Mapping of QAM16 constellations to symbols and back
 15 //
 16 // Author:
 17 //   Patrick Schaumont
 18 //
 19
 20
 21 #include "map.h"
 22
 23 // #    #    ##    #####
 24 // ##  ##   #  #   #    #
 25 // # ## #  #    #  #    #
 26 // #    #  ######  #####
 27 // #    #  #    #  #
 28 // #    #  #    #  #
 29
 30
 31 // QAM16
 32 static double vQMap16[] = {
 33   ( 0.0),
 34   (+1.0), (+1.0), (+3.0), (+3.0),
 35   (-1.0), (-3.0), (-1.0), (-3.0),
 36   (+1.0), (+3.0), (+1.0), (+3.0),
 37   (-1.0), (-3.0), (-1.0), (-3.0)
 38 };
 39
 40 static double vIMap16[] = {
 41   ( 0.0),
 42   (+1.0), (+3.0), (+1.0), (+3.0),
 43   (+1.0), (+1.0), (+3.0), (+3.0),
 44   (-1.0), (-1.0), (-3.0), (-3.0),
 45   (-1.0), (-1.0), (-3.0), (-3.0)
 46 };
 47
 48 // QPSK
 49 static double vQMap4[] = {
 50   ( 0.0),
 51   (+3.0), (-3.0), (+3.0), (-3.0),
 52 };
 53 static double vIMap4[] = {
 54   ( 0.0),
 55   (+3.0), (+3.0), (-3.0), (-3.0),
 56 };
 57
 58 map::map(char *name, FB & _sIn, FB & _iOut, FB & _qOut, double & _qpsk) : base(name) {
 59   sIn = & _sIn;
 60   qOut = & _qOut;
 61   iOut = & _iOut;
 62   qpsk = & _qpsk;
 63 }
 64
 65 dfix map::immediateQ(dfix v) {
 66   if ((int) *qpsk) {
 67     return dfix(vQMap4[(int) Val(v+1)]);
 68   } else {
 69     return dfix(vQMap16[(int) Val(v+1)]);
 70   }
 71 }
 72
 73 dfix map::immediateI(dfix v) {
 74   if ((int) *qpsk) {
 75     return dfix(vIMap4[(int) Val(v+1)]);
 76   } else {
 77     return dfix(vIMap16[(int) Val(v+1)]);
 78   }
 79 }
 80
 81 int map::run() {
 82   if (sIn->getSize() < 1)
 83     return 0;
 84   dfix v = sIn->get();
 85   *iOut << immediateI(v);
 86   *qOut << immediateQ(v);
 87   return 1;
 88 }
 89



3.7 tx/rnd.h 



  1 // rnd.h
  2 // All rights reserved -- Imec 1998
  3 // @(#)rnd.h 1.5 03/31/98
  4
  5 #ifndef RND_H
  6 #define RND_H
  7
  8 #include "qlib.h"
  9
 10 #define SYNCPERIOD 54
 11 #define SYNCWORD1 0x00
 12 #define SYNCWORD2 0x55
 13 #define SYNCWORD3 0x00
 14 #define SYNCWORD4 0x55
 15
 16 class rnd : public base {
 17   FB *input;
 18   FB *output;
 19   int synccntr;
 20
 21 public:
 22   rnd(char *name, FB & _input, FB & _output);
 23   int run();
 24   int reset();
 25 };
 26
 27 #endif

3.8 tx/rnd.cxx

  1 // rnd.cxx
  2 // All rights reserved -- Imec 1998
  3 // @(#)rnd.cxx 1.6 03/20/98
  4
  5 #include "rnd.h"
  6
  7 int glbRandom = 1;
  8
  9 int glbRandState;
 10
 11 rnd::rnd(char *name,
 12          FB & _input,
 13          FB & _output) : base(name)
 14 {
 15   input = _input.asSource(this);
 16   output = _output.asSink(this);
 17   synccntr = 0;
 18   reset();
 19 }
 20
 21
 22 #define BIT(k, n)  ((k >> (n-1)) & 1)
 23 #define MASK(k, n) (k & ((1 << (n+1))-1))
 24
 25 int randbit() {
 26   int r;
 27
 28   r = BIT(glbRandState, 5) ^ BIT(glbRandState, 6);
 29   glbRandState = MASK(r | (glbRandState << 1), 6);
 30
 31   if (glbRandom)
 32     return r;
 33   else
 34     return 0;
 35 }
 36
 37
 38 // ========================================== MEMBER FUNCTIONS
 39
 40 int rnd::reset() {
 41   // return to initial state
 42   glbRandState = (1 << 7) - 1;
 43   return 1;
 44 }
 45
 46 int rnd::run() {
 47   // firing rule
 48   if (input->getSize() < 1) {
 49     return 0;
 50   }
 51
 52
 53   // core func
 54   int i;
 55   int outbyte = 0;
 56   int inbyte = (int) Val(input->get());
 57   for (i = 7; i >= 0; i--) {
 58     outbyte = (outbyte << 1) | (randbit() ^ (inbyte >> i & 1));
 59   }
 60   synccntr++;
 61   if (synccntr == SYNCPERIOD) {
 62     // cerr << "*** INFO: randomizer sends SYN\n";
 63     output->put(outbyte);
 64     output->put(SYNCWORD1);
 65     output->put(SYNCWORD2);
 66     output->put(SYNCWORD3);
 67     output->put(SYNCWORD4);
 68     synccntr = 0;
 69     reset();
 70   }
 71   else {
 72     output->put(outbyte);
 73   }
 74   return 1;
 75 }
 76
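The whitening performed by the randomizer can be reproduced outside of the dataflow framework. The class below is a hypothetical stand-alone sketch that assumes the feedback is the exclusive-or of stages 5 and 6 (counted from 1) of a 7-bit state that is reset to all ones; synchronization words are not modeled.

```cpp
#include <cassert>

// Hypothetical stand-alone whitening LFSR.
class Whitener {
    unsigned state_;
public:
    Whitener() { reset(); }

    // Return to the initial state: seven ones.
    void reset() { state_ = 0x7f; }

    // Produces the next pseudorandom bit and advances the register.
    int nextBit() {
        int r = static_cast<int>(((state_ >> 4) ^ (state_ >> 5)) & 1u);
        state_ = ((state_ << 1) | static_cast<unsigned>(r)) & 0x7fu;
        return r;
    }

    // XORs one byte with eight LFSR bits, most significant bit first.
    int whiten(int byte) {
        assert(byte >= 0 && byte <= 0xff);
        int out = 0;
        for (int i = 7; i >= 0; --i) {
            out = (out << 1) | (nextBit() ^ ((byte >> i) & 1));
        }
        return out;
    }
};
```

Because the operation is an exclusive-or with a known bitstream, a second Whitener started from the same state removes the whitening again, which is the role of the derandomizer in the receiver.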
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77 



3.9 tx/shape.h

  1 // shape.h
  2 // All rights reserved -- Imec 1998
  3 // @(#)shape.h 1.3 03/18/98
  4
  5 #ifndef SHAPE_H
  6 #define SHAPE_H
  7
  8 #include "qlib.h"
  9
 10 #define MAXLEN 33
 11
 12 class shape : public base {
 13   FB *i_in;
 14   FB *q_in;
 15   FB *s_out;
 16   double c[MAXLEN];  // RC coefficients
 17
 18 public:
 19   shape(char *name, FB & _i_in, FB & _q_in, FB & _s_out);
 20   int run();
 21   int run_old();
 22   int reset();
 23   void makecoeffs();
 24 };
 25
 26 #endif



3.10 tx/shape.cxx

  1 // shape.cxx
  2 // All rights reserved -- Imec 1998
  3 // @(#)shape.cxx 1.7 06/26/98
  4
  5 #include "shape.h"
  6
  7 shape::shape(char *name,
  8              FB & _i_in,
  9              FB & _q_in,
 10              FB & _s_out) : base(name)
 11 {
 12   i_in = _i_in.asSource(this);
 13   q_in = _q_in.asSource(this);
 14   s_out = _s_out.asSink(this);
 15   makecoeffs();  // RRC coeff generation
 16   reset();
 17 }
 18
 19 int shape::reset() {
 20   // return to initial state
 21   while (i_in->getSize() > 0)
 22     i_in->pop();
 23   while (q_in->getSize() > 0)
 24     q_in->pop();
 25
 26   return 1;
 27 }
 28
 29 void shape::makecoeffs() {
 30   c[0] = 2.725985e-02;
 31   c[1] = 2.079339e-01;
 32   c[2] = 4.002601e-01;
 37   c[7] = 2.725985e-02;
 38 }
 39
 40 int shape::run() {
 41   int i, j;
 42 #define NF 8
 43 #define SPS 4
 44
 45   static double deli[NF];
 46   static double delq[NF];
 47
 48   if ((i_in->getSize() < 1) ||
 49       (q_in->getSize() < 1)) {
 50     return 0;
 51   }
 52
 53   for (j = 1; j <= SPS; j++) {
 54
 55     for (i = NF-1; i >= 1; i--) {
 56       deli[i] = deli[i-1];
 57       delq[i] = delq[i-1];
 58     }
 59     if (j == 1) {
 60       deli[0] = Val(i_in->get());
 61       delq[0] = Val(q_in->get());
 62     }
 63     else {
 64       deli[0] = 0;
 65       delq[0] = 0;




66 
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67 
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69 




double acca = 0; 




/ u 




f or ^ i = 0* i < NF; i + + ) ( 




/ X 




^rr-T 4-= deli fil *c f il ; 
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case ^! S out. >^U.U \ cH-^U(^/ ,iJica.j\., 




/ O 
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//end for j 
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o yi 


return 1 ; 
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25 


 89 //  5.9502848187909857e-03
 90 //  7.1303339418111898e-03
 91 // -9.0376125958858652e-04
 92 // -1.2842591240125096e-02
 93 // -1.6560488829370935e-02
 94 // -3.1424796453581099e-03
 95 //  2.2511451978267195e-02
 96 //  4.0465840802261004e-02
 97 //  2.8302892670230756e-02
 98 // -1.9056064640367836e-02
 99 // -7.6814040516083981e-02
100 // -9.7464875081018337e-02
101 // -3.7506670742425155e-02
102 //  1.1136091774729967e-01
103 //  3.0772091871906165e-01
104 //  4.7526468799142091e-01
105 //  5.4107108989550989e-01
106 //  4.7526467788525789e-01
107 //  3.0772090304860350e-01
108 //  1.1136090307335493e-01
109 // -3.7506679314098741e-02
110 // -9.7464876235465986e-02
111 // -7.6814036683689066e-02
112 // -1.9056059903703605e-02
113 //  2.8302895170883653e-02
114 //  4.0465840334864417e-02
115 //  2.2511449901436539e-02
116 // -3.1424813892788860e-03
117 // -1.6560489169667160e-02
118 // -1.2842590440175973e-02
119 // -9.0376032591496101e-04
120 //  7.1303342199545879e-03
121 //  5.9502844100395589e-03
122

3.11 tx/tuplelize.h

  1 // tuplelize.h
  2 // All rights reserved -- Imec 1998
  3 // @(#)tuplelize.h 1.4 98/03/31
  4
  5
  6 #ifndef TUPLELIZE_H
  7 #define TUPLELIZE_H
  8
  9 #include "qlib.h"
 10
 11 class tuplelize : public base {
 12   FB *byte;
 13   FB *symb;
 14   double *qpsk;
 15
 16 public:
 17   tuplelize(char *name,
 18             FB & _byte,
 19             FB & _symb,
 20             double & _qpsk);
 21   int run();
 22   int reset();
 23 };
 24
 25 #endif

3.12 tx/tuplelize.cxx

  1 // tuplelize.cxx
  2 // All rights reserved -- Imec 1998
  3 // @(#)tuplelize.cxx 1.6 98/03/31
  4
  5 #include "tuplelize.h"
  6
  7
  8 tuplelize::tuplelize(char *name,
  9                      FB & _byte,
 10                      FB & _symb,
 11                      double & _qpsk) : base(name)
 12 {
 13   byte = _byte.asSource(this);
 14   symb = _symb.asSink(this);
 15   qpsk = &_qpsk;
 16 }
 17
 18 //
 19
 20 int tuplelize::reset() {
 21   return 1;
 22 }
 23
 24 int tuplelize::run() {
 25
 26   // firing rule
 27   if (byte->getSize() < 1)
 28     return 0;
 29
 30   // core func
 31   int us, msk, sym;
 32
 33   if ((int) *qpsk) {
 34     us = 2; msk = 0x03;
 35   } else {
 36     us = 4; msk = 0x0F;
 37   }
 38
 39   int tuple = (int) Val(byte->get());
 40
 41   for (int k = 1; k <= 8/us; k++) {
 42     sym = (tuple >> (8-us)) & msk;
 43     tuple = (tuple << us) & 0xff;
 44     symb->put(sym);
 45   }
 46
 47   return 1;
 48 }
 49
 50
 51



4 Channel Model Code

4.1 chan/fir.h

  1 // fir.h
  2 // All rights reserved -- Imec 1998
  3 // @(#)fir.h 1.2 03/31/98
  4
  5 #ifndef FIR_H
  6 #define FIR_H
  7
  8 #define NRTAPS 20
  9
 10 #include "qlib.h"
 11
 12 class fir : public base {
 13   FB *input;
 14   FB *output;
 15   double x[NRTAPS];  // filtertaps: 0, 1, ..., NRTAPS-1
 16   double *t1, *t2, *t3, *t20;
 17
 18 public:
 19   fir(char *name, FB & _input, FB & _output,
 20       double & _t1, double & _t2, double & _t3, double & _t20);
 21   int run();
 22   int reset();
 23 };
 24
 25 #endif

4.2 chan/fir.cxx

  1 // fir.cxx
  2 // All rights reserved -- Imec 1998
  3 // @(#)fir.cxx 1.3 03/31/98
  4
  5 #include "fir.h"
  6
  7 fir::fir(char *name,
  8          FB & _input,
  9          FB & _output,
 10          double & _t1, double & _t2, double & _t3, double & _t20) : base(name)
 11 {
 12   input = _input.asSource(this);
 13   output = _output.asSink(this);
 14
 15   for (int i = 0; i < NRTAPS; i++) {
 16     x[i] = 0;
 17   }
 18   t1 = &_t1;
 19   t2 = &_t2;
 20   t3 = &_t3;
 21   t20 = &_t20;
 22 }
 23
 24 int fir::reset() {
 25   // return to initial state
 26   for (int i = 0; i < NRTAPS; i++) {
 27     x[i] = 0;
 28   }
 29   return 1;
 30 }
 31
 32 int fir::run() {
 33   // firing rule
 34   if (input->getSize() < 1) {
 35     return 0;
 36   }
 37
 38   dfix in = input->get();
 39
 40   int i;
 41   for (i = NRTAPS-1; i >= 1; i--) {
 42     x[i] = x[i-1];
 43   }
 44   x[0] = Val(in);
 45
 46   // core func
 47   double out = x[0] + x[1]*(*t1) + x[2]*(*t2) + x[3]*(*t3) + x[20]*(*t20);
 48   output->put(out);
 49
 50   return 1;
 51 }
 52
 53




4.3 chan/noise.h

  1 // noise.h
  2 // All rights reserved -- Imec 1998
  3 // @(#)noise.h 1.2 03/20/98
  4
  5 #ifndef NOISE_H
  6 #define NOISE_H
  7
  8 #include "qlib.h"
  9 #include "pseudorn.h"
 10
 11 class noise : public base {
 12   FB *in;
 13   FB *out;
 14   double *n;
 15   pseudorn RN;
 16
 17 public:
 18   noise(char *name, FB & _in, FB & _out, double & _n);
 19   int reset();
 20   int run();
 21 };
 22
 23 #endif

4.4 chan/noise.cxx

  1 // noise.cxx
  2 // All rights reserved -- Imec 1998
  3 // @(#)noise.cxx 1.3 03/20/98
  4
  5 #include "noise.h"
  6 #include <math.h>
  7
  8 noise::noise(char *name, FB & _in, FB & _out, double & _n) : base(name) {
  9   in = _in.asSource(this);
 10   out = _out.asSink(this);
 11   n = &_n;
 12 }
 13
 14
 15 int noise::run() {
 16   // firing rule
 17   if (in->getSize() < 1) {
 18     return 0;
 19   }
 20
 21   // core function
 22   double U1 = (double)(RN.out()) / (double) PRNMAX + 1 / (double) PRNMAX;
 23   double U2 = (double)(RN.out()) / (double) PRNMAX + 1 / (double) PRNMAX;
 24
 25   double X = sqrt(-2.*log(U1)) * cos(2.*M_PI*U2);
 26
 27   out->put(Val(in->get()) + X*(*n));
 28
 29   return 1;
 30
 31 }

4.5 chan/pseudorn.h

// pseudorn.h
// All rights reserved Imec 1998
// @(#)pseudorn.h 1.2 03/31/98

#ifndef pseudorn_H
#define pseudorn_H

#define MULT 0x015a4e35L
#define INCR 1
#define PRNMAX 32767 // = 2^15-1

#include <time.h>

class pseudorn {
  long seed;
  unsigned range;
public:
  pseudorn() {
    range = PRNMAX;
    seed = time(0);
  }
  pseudorn(unsigned s, unsigned r) {
    seed = s;
    range = r;
  }
  pseudorn(unsigned r) {
    range = r;
    seed = time(0);
  }
  unsigned out(void) {
    seed = MULT * seed + INCR;
    return ((unsigned) (seed >> 16) & 0x7fff) % range;
  }
  long getSeed() {return seed;}
  void setSeed(long s) {seed = s;}
};


#include "qlib.h"

class pseudorn_gen : public base {
  pseudorn RN;
  FB *out;
public:
  pseudorn_gen(char *name, FB & _out) :
    base(name),
    RN(255) {
    out = _out.asSink(this);
  }
  int run() {
    out->put(RN.out());
    return 1;
  }
};

#endif

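The `out()` method above is a linear congruential generator: the state is multiplied by `MULT`, incremented by `INCR`, and bits 16 to 30 of the result are reduced modulo `range`. The sketch below restates that step with an explicit 32-bit state so the arithmetic can be checked by hand. The class name `Lcg` and the use of `std::uint32_t` are assumptions; the listing uses `long`, whose width depends on the platform, although the returned bits only depend on the low 32 bits of the state.

```cpp
#include <cstdint>

// Sketch of the generator's arithmetic with explicit 32-bit wrap-around.
class Lcg {
    std::uint32_t seed_;
    unsigned range_;
public:
    Lcg(std::uint32_t seed, unsigned range) : seed_(seed), range_(range) {}

    unsigned next() {
        seed_ = 0x015a4e35u * seed_ + 1u;           // MULT * seed + INCR
        return ((seed_ >> 16) & 0x7fffu) % range_;  // bits 16..30, folded into [0, range)
    }
};
```

Starting from seed 0, the state becomes 1 and then 0x015a4e36, so the first two outputs with range 32767 are 0 and 346 (0x015a).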



5 



4.6 chan/pseudorn.cxx

// pseudorn.cxx
// All rights reserved Imec 1998
// @(#)pseudorn.cxx 1.1 03/17/98

#include "pseudorn.h"

// inlined stuff




5 System Code

5.1 driver/driver.h

#ifndef DRIVER_H
#define DRIVER_H

// @(#)driver.h 1.2 98/03/20

#include "qlib.h"
#include "Callback2wRet.h"

class interpreter {
public:
  interpreter();
  void add(sysgen &s);
  void observe(double &v, char *name);
  void obsAttr(Callback2wRet<int, double, int> cb, int, char *name);
  friend interpreter & operator<<(interpreter &p, sysgen &s);
  friend interpreter & operator<<(interpreter &p, clk &ck);
  void go(int argc, char **argv);
};

#endif

5.2 driver/driver.cxx

#include "tcl.h"
#include <iostream.h>

#define MAKE_WISH

#ifdef MAKE_WISH
#include "tk.h"
#endif

// @(#)driver.cxx 1.3 98/03/27

#include "qlib.h"
#include "qtb.h"
#include "driver.h"
#include "Callback2wRet.h"
16 

// interpreter OCAPI-related datastructures
//

Callback2wRet<int, double, int> functorlist[100];
int numfunctors = 0;

int graphLines = 0;

FBQ(trace0);
FBQ(trace1);
FBQ(trace2);
FBQ(trace3);
FBQ(trace4);
FBQ(trace5);
FBQ(trace6);
FBQ(trace7);
dfbfix *traces[8];
dfbfix *tracedqueue[8];

Tcl_HashTable queue_hash;

#define IF_SUFFIX(A) if ((strlen(r->name()) > strlen(A)) && \
  (!strcmp(r->name() + strlen(r->name()) - strlen(A), A)))


void create_queue_hash() {
  Tcl_InitHashTable(&queue_hash, TCL_STRING_KEYS);

  dfbfix *r;
  for (r = listOfFB; r; r = r->nextFB()) {
    int present;
    IF_SUFFIX("_mark")
      continue;
    IF_SUFFIX("_stim")
      continue;
    Tcl_SetHashValue(Tcl_CreateHashEntry(&queue_hash, r->name(), &present), (char *) r);
  }
}
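The `IF_SUFFIX` macro used above skips queues whose names end in `_mark` or `_stim`. Its condition can be written as an ordinary function, as in the sketch below. The name `has_suffix` is an assumption introduced for illustration; the listing itself uses only the macro.

```cpp
#include <cstring>

// Returns true when name is strictly longer than suffix and ends with it,
// mirroring the condition spelled out by the IF_SUFFIX macro.
bool has_suffix(const char *name, const char *suffix) {
    const std::size_t n = std::strlen(name);
    const std::size_t s = std::strlen(suffix);
    return n > s && std::strcmp(name + n - s, suffix) == 0;
}
```

Note that the length comparison is strict, so a queue named exactly `_mark` would not be filtered out.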
53 

// next are created by the interpreter object itself
Tcl_HashTable sched_hash;
Tcl_HashTable doubles_hash;
Tcl_HashTable attr_hashfunc;
Tcl_HashTable attr_hashint;

clk *glbClk; // global (single) clock
20 61 

//------------------------------------------------------------//
int ListQueue(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if ((argc > 2)) {
    interp->result = "Usage: listq ?queue?\n";
    return TCL_ERROR;
  }

  char *match = 0;
  if (argc == 2) {
    match = argv[1];
  }

  if (match) {
    Tcl_HashEntry *p = Tcl_FindHashEntry(&queue_hash, argv[1]);
    if (p != 0) {
      Tcl_AppendElement(interp, ((dfbfix *) Tcl_GetHashValue(p))->name());
    }
  } else {
    Tcl_HashSearch k;
    Tcl_HashEntry *p = Tcl_FirstHashEntry(&queue_hash, &k);
    while (p != 0) {
      Tcl_AppendElement(interp, ((dfbfix *) Tcl_GetHashValue(p))->name());
      p = Tcl_NextHashEntry(&k);
    }
  }

  return TCL_OK;
}
25 90 

//------------------------------------------------------------//
int GetQueue(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if (argc != 2) {
    interp->result = "Usage: getq queue\n";
    return TCL_ERROR;
  }

  Tcl_HashEntry *p = Tcl_FindHashEntry(&queue_hash, argv[1]);
  if (p != 0) {
    dfbfix *q = (dfbfix *) Tcl_GetHashValue(p);
    while (q->getSize()) {
      strstream N;
      N << Val(q->get()) << ends;
      Tcl_AppendElement(interp, N.str());
    }
  }

  return TCL_OK;
}
110 

//------------------------------------------------------------//
int PutQueue(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if (argc != 3) {
    interp->result = "Usage: putq queue value\n";
    return TCL_ERROR;
  }

  Tcl_HashEntry *p = Tcl_FindHashEntry(&queue_hash, argv[1]);
  if (p != 0) {
    double v;
    sscanf(argv[2], "%lf", &v);
    dfbfix *q = (dfbfix *) Tcl_GetHashValue(p);
    q->put(v);
  }

  return TCL_OK;
}

//------------------------------------------------------------//
int TraceQueue(ClientData, Tcl_Interp *interp, int argc, char **argv) {

  if ((argc != 1) && (argc != 3)) {
    interp->result = "Usage: traceq ?traceq queuename?\n";
    return TCL_ERROR;
  }

  if (argc == 1) {
    int k;
    for (k = 0; k < 8; k++) {
      strstream N;
      N << traces[k]->name() << " ";
      if (tracedqueue[k] != 0)
        N << tracedqueue[k]->name();
      N << ends;
      Tcl_AppendElement(interp, N.str());
    }
  } else {
    Tcl_HashEntry *p = Tcl_FindHashEntry(&queue_hash, argv[2]);
    dfbfix *q = 0;
    if (p != 0) {
      q = (dfbfix *) Tcl_GetHashValue(p);
    } else {
      return TCL_OK;
    }

    int num;
    for (num = 0; num < 8; num++) {
      if (!strcmp(argv[1], traces[num]->name()))
        break;
    }

    if (num > 7)
      return TCL_OK;

    if (tracedqueue[num] != 0) {
      tracedqueue[num]->asDup(nilFB);
    }

    tracedqueue[num] = q;
    q->asDup(*traces[num]);
  }
  return TCL_OK;
}

174 

//------------------------------------------------------------//
int ReadQueue(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if (argc != 2) {
    interp->result = "Usage: readq queue\n";
    return TCL_ERROR;
  }

  Tcl_HashEntry *p = Tcl_FindHashEntry(&queue_hash, argv[1]);
  if (p != 0) {
    dfbfix *q = (dfbfix *) Tcl_GetHashValue(p);
    int k;
    for (k = 0; k < q->getSize(); k++) {
      strstream N;
      N << Val((*q)[k]) << ends;
      Tcl_AppendElement(interp, N.str());
    }
  }

  return TCL_OK;
}
195 

//------------------------------------------------------------//
int PlotQueue(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  int i;
  if (argc < 2) {
    interp->result = "Usage: plotq queue ?...?\n";
    return TCL_ERROR;
  }

  char *f = tmpnam(NULL);
  ofstream PLOTBUF(f);

  // headers
  PLOTBUF << "TitleText: ";
  for (i = 1; i < argc; i++) {
    Tcl_HashEntry *p = Tcl_FindHashEntry(&queue_hash, argv[i]);
    if (p != 0)
      PLOTBUF << ((dfbfix *) Tcl_GetHashValue(p))->name() << " ";
  }
  PLOTBUF << "\n";

  PLOTBUF << "BackGround: Black\n";
  PLOTBUF << "ForeGround: White\n";
  PLOTBUF << "XUnitText: Sample\n";
  PLOTBUF << "BoundBox: True\n";
  PLOTBUF << "0.Color: Yellow\n";
  PLOTBUF << "LabelFont: -adobe-helvetica-*-r-*-*-16-*-*-*\n";
  PLOTBUF << "Markers: True\n";
  if (!graphLines)
    PLOTBUF << "NoLines: True\n";

  // data
  for (i = 1; i < argc; i++) {
    PLOTBUF << "\n";
    Tcl_HashEntry *p = Tcl_FindHashEntry(&queue_hash, argv[i]);
    if (p != 0) {
      int j;
      PLOTBUF << "\"" << ((dfbfix *) Tcl_GetHashValue(p))->name() << "\"\n";
      for (j = 0; j < ((dfbfix *) Tcl_GetHashValue(p))->getSize(); j++) {
        PLOTBUF << j << " " << ((dfbfix *) Tcl_GetHashValue(p))->getIndex(j) << "\n";
      }
    }
  }
  PLOTBUF.close();

  system(strapp(strapp("xgraph ", f), " &"));
  return TCL_OK;
}
243 

//------------------------------------------------------------//
int ScatQueue(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  int i;
  if (argc != 3) {
    interp->result = "Usage: scatq queuex queuey\n";
    return TCL_ERROR;
  }

  ofstream PLOTBUF(".plotbuf");

  // headers
  PLOTBUF << "TitleText: ";
  for (i = 1; i < argc; i++) {
    Tcl_HashEntry *p = Tcl_FindHashEntry(&queue_hash, argv[i]);
    if (p != 0)
      PLOTBUF << ((dfbfix *) Tcl_GetHashValue(p))->name() << " ";
  }
  PLOTBUF << "\n";

  PLOTBUF << "BackGround: Black\n";
  PLOTBUF << "ForeGround: White\n";
  PLOTBUF << "XUnitText: Sample\n";
  PLOTBUF << "BoundBox: True\n";
  PLOTBUF << "0.Color: Yellow\n";
  PLOTBUF << "LabelFont: -adobe-helvetica-*-r-*-*-16-*-*-*-*-*-*-*\n";
  PLOTBUF << "Markers: True\n";
  if (!graphLines)
    PLOTBUF << "NoLines: True\n";

  // data
  PLOTBUF << "\n";
  Tcl_HashEntry *p1 = Tcl_FindHashEntry(&queue_hash, argv[1]);
  Tcl_HashEntry *p2 = Tcl_FindHashEntry(&queue_hash, argv[2]);
  if ((p1 != 0) && (p2 != 0)) {
    int j;
    int max = ((dfbfix *) Tcl_GetHashValue(p1))->getSize();
    if (((dfbfix *) Tcl_GetHashValue(p2))->getSize() < max) {
      max = (((dfbfix *) Tcl_GetHashValue(p2))->getSize());
    }
    for (j = 0; j < max; j++) {
      PLOTBUF << ((dfbfix *) Tcl_GetHashValue(p1))->getIndex(j)
              << " "
              << ((dfbfix *) Tcl_GetHashValue(p2))->getIndex(j) << "\n";
    }
  }
  PLOTBUF.close();

  system("xgraph .plotbuf &");
  return TCL_OK;
}
O 294 

//------------------------------------------------------------//
int StatQueue(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if (argc > 2) {
    interp->result = "Usage: statq ?queue?\n";
    return TCL_ERROR;
  }

  char *match = 0;
  if (argc == 2) {
    match = argv[1];
  }

  dfbfix *r;
  for (r = listOfFB; r; r = r->nextFB()) {
    IF_SUFFIX("_mark")
      continue;
    IF_SUFFIX("_stim")
      continue;
    if (!match || (!strcmp(r->name(), match))) {
      strstream N;
      N << *r << ends;
      Tcl_AppendElement(interp, N.str());
    }

  }

  return TCL_OK;
}

323 

//------------------------------------------------------------//
int ClearQueue(ClientData, Tcl_Interp *interp, int argc, char **) {
  if (argc > 1) {
    interp->result = "Usage: clearq\n";
    return TCL_ERROR;
  }

  dfbfix *r;
  for (r = listOfFB; r; r = r->nextFB())
    while (r->getSize() > 0)
      r->pop();

  return TCL_OK;
}
5 338 

//------------------------------------------------------------//
int ListSchedule(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if ((argc > 2)) {
    interp->result = "Usage: lists ?schedule?\n";
    return TCL_ERROR;
  }

  char *match = 0;
  if (argc == 2) {
    match = argv[1];
  }

  if (match) {
    Tcl_HashEntry *p = Tcl_FindHashEntry(&sched_hash, argv[1]);
    if (p != 0) {
      Tcl_AppendElement(interp, ((sysgen *) Tcl_GetHashValue(p))->getname());
    }
  } else {
    Tcl_HashSearch k;
    Tcl_HashEntry *p = Tcl_FirstHashEntry(&sched_hash, &k);
    while (p != 0) {
      Tcl_AppendElement(interp, ((sysgen *) Tcl_GetHashValue(p))->getname());
      p = Tcl_NextHashEntry(&k);
    }
  }

  return TCL_OK;
}

367 

//------------------------------------------------------------//
int RunSchedule(ClientData, Tcl_Interp *interp, int argc, char **argv) {

  if ((argc != 3)) {
    interp->result = "Usage: runs schedule clock_iterations\n";
    return TCL_ERROR;
  }

  Tcl_HashEntry *p = Tcl_FindHashEntry(&sched_hash, argv[1]);
  if (p != 0) {
    unsigned v;
    sscanf(argv[2], "%d", &v);
    sysgen *sys = (sysgen *) Tcl_GetHashValue(p);

    while (v--)
      sys->run(*glbClk);

  }

  return TCL_OK;
}
5 389 

//------------------------------------------------------------//
int VhdlSchedule(ClientData, Tcl_Interp *interp, int argc, char **argv) {

  if ((argc != 2)) {
    interp->result = "Usage: vhdls schedule\n";
    return TCL_ERROR;
  }

  Tcl_HashEntry *p = Tcl_FindHashEntry(&sched_hash, argv[1]);
  if (p != 0) {
    sysgen *sys = (sysgen *) Tcl_GetHashValue(p);
    sys->vhdlook();
  }

  return TCL_OK;
}
406. 

//------------------------------------------------------------//
int ListParameter(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if ((argc > 2)) {
    interp->result = "Usage: listp ?parameter?\n";
    return TCL_ERROR;
  }

  char *match = 0;
  if (argc == 2) {
    match = argv[1];
  }

  if (match) {
    Tcl_HashEntry *p = Tcl_FindHashEntry(&doubles_hash, argv[1]);
    if (p != 0) {
      Tcl_AppendElement(interp, Tcl_GetHashKey(&doubles_hash, p));
    }
  } else {
    Tcl_HashSearch k;
    Tcl_HashEntry *p = Tcl_FirstHashEntry(&doubles_hash, &k);
    while (p != 0) {
      Tcl_AppendElement(interp, Tcl_GetHashKey(&doubles_hash, p));
      p = Tcl_NextHashEntry(&k);
    }
  }

  return TCL_OK;
}

//------------------------------------------------------------//
int SetParameter(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if ((argc != 3)) {
    interp->result = "Usage: setp parameter value\n";
    return TCL_ERROR;
  }

  Tcl_HashEntry *p = Tcl_FindHashEntry(&doubles_hash, argv[1]);
  if (p != 0) {
    double v;
    sscanf(argv[2], "%lf", &v);
    double *q = (double *) Tcl_GetHashValue(p);
    *q = v;
  }

  return TCL_OK;
}

£ 20 452 

//------------------------------------------------------------//
int ReadParameter(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if (argc != 2) {
    interp->result = "Usage: readp parameter\n";
    return TCL_ERROR;
  }

  Tcl_HashEntry *p = Tcl_FindHashEntry(&doubles_hash, argv[1]);
  if (p != 0) {
    double *q = (double *) Tcl_GetHashValue(p);
    strstream N;
    N << *q << ends;
    Tcl_AppendElement(interp, N.str());
  }

  return TCL_OK;
}

470 

//------------------------------------------------------------//
int ListAttribute(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if ((argc > 2)) {
    interp->result = "Usage: lista ?attribute?\n";
    return TCL_ERROR;
  }

  char *match = 0;
  if (argc == 2) {
    match = argv[1];
  }

  if (match) {
    Tcl_HashEntry *p = Tcl_FindHashEntry(&attr_hashfunc, argv[1]);
    if (p != 0) {
      Tcl_AppendElement(interp, Tcl_GetHashKey(&attr_hashfunc, p));
    }
  } else {
    Tcl_HashSearch k;
    Tcl_HashEntry *p = Tcl_FirstHashEntry(&attr_hashfunc, &k);
    while (p != 0) {
      Tcl_AppendElement(interp, Tcl_GetHashKey(&attr_hashfunc, p));
      p = Tcl_NextHashEntry(&k);
    }
  }

  return TCL_OK;
}

499 

//------------------------------------------------------------//
int SetAttribute(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if ((argc != 3)) {
    interp->result = "Usage: seta attribute value\n";
    return TCL_ERROR;
  }

  Tcl_HashEntry *pf = Tcl_FindHashEntry(&attr_hashfunc, argv[1]);
  Tcl_HashEntry *pi = Tcl_FindHashEntry(&attr_hashint, argv[1]);

  if (pf != 0) {
    int n = (int) Tcl_GetHashValue(pi);
    double v;
    sscanf(argv[2], "%lf", &v);
    //call member func
    functorlist[(int) Tcl_GetHashValue(pf)](n, v);
  }

  return TCL_OK;
}

520 

//------------------------------------------------------------//
int SetLineStyle(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if ((argc != 2)) {
    interp->result = "Usage: lines 1/0\n";
    return TCL_ERROR;
  }

  int v;
  sscanf(argv[1], "%d", &v);
  if (v != 0)
    graphLines = 1;
  else
    graphLines = 0;

  return TCL_OK;
}

537 

//------------------------------------------------------------//
int Testbenches(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if ((argc != 2)) {
    interp->result = "Usage: testb 1/0\n";
    return TCL_ERROR;
  }

  int v;
  sscanf(argv[1], "%d", &v);
  if (v != 0)
    qtb::glbDisableTestbenches = 0;
  else
    qtb::glbDisableTestbenches = 1;

  return TCL_OK;
}

554 

//------------------------------------------------------------//
int OCAPIHelp(ClientData, Tcl_Interp *interp, int, char **) {
  Tcl_AppendElement(interp, "Available OCAPI-related commands:\n");
  Tcl_AppendElement(interp, "listq ?queue_name?  List queue(s)\n");
  Tcl_AppendElement(interp, "statq ?queue_name?  Queue(s) statistics\n");
  Tcl_AppendElement(interp, "readq queue_name  Return queue contents\n");
  Tcl_AppendElement(interp, "getq queue_name  Return and empty queue contents\n");
  Tcl_AppendElement(interp, "putq queue_name value  Add value to queue\n");
  Tcl_AppendElement(interp, "plotq queue_name ?...?  Display queue contents graphically\n");
  Tcl_AppendElement(interp, "scatq queue_name queue_name  Display queue contents graphically\n");
  Tcl_AppendElement(interp, "traceq ?tracenum queue_name?  Trace writes to the queue\n");
  Tcl_AppendElement(interp, "clearq  Clears contents of queues\n");
  Tcl_AppendElement(interp, "lists ?schedule_name?  List available schedules\n");
  Tcl_AppendElement(interp, "runs schedule_name iter  Runs iter iterations of a schedule\n");
  Tcl_AppendElement(interp, "vhdls schedule_name  Dumps VHDL code for a schedule\n");
  Tcl_AppendElement(interp, "listp ?parameter_name?  List parameters\n");
  Tcl_AppendElement(interp, "setp parameter_name value  List parameters\n");
  Tcl_AppendElement(interp, "readp parameter_name  Return Variable Contents\n");
  Tcl_AppendElement(interp, "lista ?attribute_name?  List attributes\n");
  Tcl_AppendElement(interp, "seta attribute_name value  Set attribute\n");
  Tcl_AppendElement(interp, "lines 1/0  Turns on/off line drawing\n");
  Tcl_AppendElement(interp, "testb 1/0  Disables test benches\n");
  return TCL_OK;
}

//------------------------------------------------------------//
// initialization and command definition
int AppInit(Tcl_Interp *interp) {

  if (Tcl_Init(interp) == TCL_ERROR)
    return TCL_ERROR;

#ifdef MAKE_WISH
  if (Tk_Init(interp) == TCL_ERROR)
    return TCL_ERROR;
#endif

  create_queue_hash();

  Tcl_CreateCommand(interp, "listq", ListQueue, NULL, NULL);
  Tcl_CreateCommand(interp, "statq", StatQueue, NULL, NULL);
  Tcl_CreateCommand(interp, "readq", ReadQueue, NULL, NULL);
  Tcl_CreateCommand(interp, "getq", GetQueue, NULL, NULL);
  Tcl_CreateCommand(interp, "putq", PutQueue, NULL, NULL);
  Tcl_CreateCommand(interp, "plotq", PlotQueue, NULL, NULL);
  Tcl_CreateCommand(interp, "scatq", ScatQueue, NULL, NULL);
  Tcl_CreateCommand(interp, "traceq", TraceQueue, NULL, NULL);
  Tcl_CreateCommand(interp, "clearq", ClearQueue, NULL, NULL);

  Tcl_CreateCommand(interp, "lists", ListSchedule, NULL, NULL);
  Tcl_CreateCommand(interp, "runs", RunSchedule, NULL, NULL);
  Tcl_CreateCommand(interp, "vhdls", VhdlSchedule, NULL, NULL);

  Tcl_CreateCommand(interp, "listp", ListParameter, NULL, NULL);
  Tcl_CreateCommand(interp, "setp", SetParameter, NULL, NULL);
  Tcl_CreateCommand(interp, "readp", ReadParameter, NULL, NULL);

  Tcl_CreateCommand(interp, "lista", ListAttribute, NULL, NULL);
  Tcl_CreateCommand(interp, "seta", SetAttribute, NULL, NULL);

  Tcl_CreateCommand(interp, "testb", Testbenches, NULL, NULL);
  Tcl_CreateCommand(interp, "lines", SetLineStyle, NULL, NULL);
  Tcl_CreateCommand(interp, "OCAPI", OCAPIHelp, NULL, NULL);

  return TCL_OK;
}

621 

622 

//------------------------------------------------------------//

interpreter & operator<<(interpreter &p, sysgen &s) {
  p.add(s);
  return p;
}

interpreter & operator<<(interpreter &p, clk &ck) {
  glbClk = &ck;
  return p;
}
634 

void interpreter::observe(double &v, char *name) {
  int present;
  Tcl_SetHashValue(Tcl_CreateHashEntry(&doubles_hash, name, &present), (char *) &v);
}

void interpreter::obsAttr(Callback2wRet<int, double, int> f, int n, char *name) {
  int present;
  functorlist[numfunctors++] = f;
  if (numfunctors > 100) {
    cerr << "*** ERROR: max num functors exceeded\n";
    exit(0);
  }
  Tcl_SetHashValue(Tcl_CreateHashEntry(&attr_hashfunc, name, &present), (char *) numfunctors-1);
  Tcl_SetHashValue(Tcl_CreateHashEntry(&attr_hashint, name, &present), (char *) n);
}
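`obsAttr` stores each callback in the fixed 100-entry `functorlist` and records its index in a hash table; the listing tests the count only after the store. As a sketch of the same registry idea with the capacity test placed before the store, consider the following. The class `FunctorTable`, its methods, and the use of `std::function` in place of `Callback2wRet` are all assumptions introduced for illustration.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical fixed-capacity registry; add() returns the slot index,
// or -1 when the table is already full (nothing is written in that case).
class FunctorTable {
    static constexpr std::size_t kCapacity = 100;
    std::array<std::function<int(int, double)>, kCapacity> slots_;
    std::size_t count_ = 0;
public:
    int add(std::function<int(int, double)> f) {
        if (count_ >= kCapacity) {
            return -1;
        }
        slots_[count_] = std::move(f);
        return static_cast<int>(count_++);
    }
    int call(int slot, int attr, double value) const {
        return slots_[static_cast<std::size_t>(slot)](attr, value);
    }
};
```

The returned slot index plays the role of the integer that the listing stores in `attr_hashfunc` and later uses to look up the callback in `SetAttribute`.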
650 

interpreter::interpreter() {
  Tcl_InitHashTable(&sched_hash, TCL_STRING_KEYS);
  Tcl_InitHashTable(&doubles_hash, TCL_STRING_KEYS);
  Tcl_InitHashTable(&attr_hashfunc, TCL_STRING_KEYS);
  Tcl_InitHashTable(&attr_hashint, TCL_STRING_KEYS);
  numfunctors = 0;
  traces[0] = &trace0; tracedqueue[0] = &nilFB;
  traces[1] = &trace1; tracedqueue[1] = &nilFB;
  traces[2] = &trace2; tracedqueue[2] = &nilFB;
  traces[3] = &trace3; tracedqueue[3] = &nilFB;
  traces[4] = &trace4; tracedqueue[4] = &nilFB;
  traces[5] = &trace5; tracedqueue[5] = &nilFB;
  traces[6] = &trace6; tracedqueue[6] = &nilFB;
  traces[7] = &trace7; tracedqueue[7] = &nilFB;
}















15 666 

void interpreter::add(sysgen &s) {
  int present;
  Tcl_SetHashValue(Tcl_CreateHashEntry(&sched_hash, s.getname(), &present), (char *) &s);
}

void interpreter::go(int argc, char **argv) {
#ifdef MAKE_WISH
  Tk_Main(argc, argv, AppInit);
#else
  Tcl_Main(argc, argv, AppInit);
#endif

}


5.3 driver/sys.cxx



10 



// sys.cxx
// All rights reserved Imec 1998
// @(#)sys.cxx 1.5 98/03/31

#include "qlib.h"
#include "hshake.h"
#include "driver.h"
#include "sys.h"

double glbQPSK = 0.;
double glbDiff = 0.;
double glbT1 = 0.;
double glbT2 = 0.;
double glbT3 = 0.;
double glbT20 = 0.;
double glbNoiseLevel = 0.;
double glbADWbits = 10.;
double glbADLbits = 6.;

int main(int argc, char **argv)
{
  LOADTYPES(../rx/TYPEDEF);

  // global synchronous clock
  clk ck;

  //------------------------------------------------------------
  //
  //byte source
  //
  FBQ(tx_bytes);
  pseudorn_gen GEN_RN("gen_rx",
                      tx_bytes);





  sysgen GEN("GEN");
  GEN << GEN_RN;

  //------------------------------------------------------------
  //
  //transmitter
  //
  FBQ(tx_rnd_bytes);
  FBQ(tx_symbols);
  FBQ(tx_dif_symbols);
  FBQ(tx_ival);
  FBQ(tx_qval);
  FBQ(tx_sig);
  FBQ(tx_sig_quant);

  rnd TX_RND("tx_derandom",
             tx_bytes,
             tx_rnd_bytes);
  tuplelize TX_TUPLE("tx_tuple",
                     tx_rnd_bytes,
                     tx_symbols,
                     glbQPSK);
  diffenc TX_DIFFE("tx_diffe",
                   tx_symbols,
                   tx_dif_symbols,
                   glbQPSK,
                   glbDiff);

  map TX_MAP("tx_map",
             tx_dif_symbols,
             tx_ival,
             tx_qval,
             glbQPSK);
  shape TX_SHAPE("tx_shape",
                 tx_ival,
                 tx_qval,
                 tx_sig);
  ad TX_AD("tx_ad",
           tx_sig,
           tx_sig_quant,
           glbADWbits,
           glbADLbits);

  sysgen TX("TX");
  TX << TX_RND;
  TX << TX_TUPLE;
  TX << TX_DIFFE;
  TX << TX_MAP;
  TX << TX_SHAPE;
  TX << TX_AD;


  //------------------------------------------------------------
  //
  //channel
  //
  FBQ(chan_isi);
  FBQ(chan_out);

  fir CHAN_FIR("chan_fir",
               tx_sig_quant,
               chan_isi,
               glbT1,
               glbT2,
               glbT3,
               glbT20);

  noise CHAN_NOISE("chan_noise",
                   chan_isi,
                   chan_out,
                   glbNoiseLevel);

  sysgen CHAN("CHAN");
  CHAN << CHAN_FIR;
  CHAN << CHAN_NOISE;
108 

  //------------------------------------------------------------
  //
  //receiver
  //
  FBQ(rx_constel_mode);
  FBQ(rx_lms_i);
  FBQ(rx_lms_q);
  FBQ(rx_symtype);
  lmsff RX_LMSFF("lmsff",
                 ck,
                 rx_constel_mode,
                 chan_out,

                 rx_lms_i,
                 rx_lms_q,
                 rx_symtype
                 );

  RX_LMSFF.setAttr(lmsff::FWLENGTH, 8);
  RX_LMSFF.setAttr(lmsff::STEP_PAR, 4);
  RX_LMSFF.setAttr(lmsff::P0, -0.2*2.0);
  RX_LMSFF.setAttr(lmsff::P1, 0.7*2.0);
  RX_LMSFF.setAttr(lmsff::P2, 0.7*2.0);
  RX_LMSFF.setAttr(lmsff::P3, -0.2*2.0);
  RX_LMSFF.setAttr(lmsff::REF, 3.0);
  RX_LMSFF.setAttr(lmsff::INIT);
  RX_LMSFF.setAttr(lmsff::SPS_PAR, 4);

  FBQ(rx_symtype_at);
  FBQ(rx_diff_mode);
  FBQ(rx_symbol);
  demap RX_DEMAP("demap",
                 ck,
                 rx_symtype,
                 rx_diff_mode,
                 rx_lms_i,
                 rx_lms_q,

                 rx_symtype_at,
                 rx_symbol
                 );

  RX_DEMAP.setAttr(demap::DEBUGMODE, 0);
  RX_DEMAP.setAttr(demap::REF, 3.0);

  FBQ(rx_syncro);
  FBQ(rx_byte_rnd);
  detuple RX_DETUPLE("detuple",
                     ck,
                     rx_symbol,
                     rx_symtype_at,

                     rx_byte_rnd,
                     rx_syncro
                     );

  RX_DETUPLE.setAttr(detuple::DEBUGMODE, 0);

  FBQ(rx_byte_out);
  FBQ(rx_sync_out);
  derand RX_DERAND("derand",
                   ck,
                   rx_byte_rnd,
                   rx_syncro,

                   rx_byte_out,
                   rx_sync_out
                   );

  RX_DERAND.setAttr(derand::DEBUGMODE, 0);
  RX_DERAND.setAttr(derand::SEED, 0x3f);
  RX_DERAND.setAttr(derand::BYPASS, 0);

  sysgen RX_UT("RX_UT");
  RX_UT << RX_LMSFF;
  RX_UT << RX_DEMAP;
  RX_UT << RX_DETUPLE;
  RX_UT << RX_DERAND;
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187 

  // clocktrue definition
  handshake hsk1("h1", ck);
  handshake hsk2("h2", ck);
  handshake hsk3("h3", ck);

  rx_lms_i.sethandshake(hsk1);
  rx_symbol.sethandshake(hsk2);
  rx_byte_rnd.sethandshake(hsk3);

  RX_LMSFF.define();
  RX_DEMAP.define();
  RX_DETUPLE.define();
  RX_DERAND.define();

  sysgen RX_TI("RX_TI");
  RX_TI << RX_LMSFF.fsm();
  RX_TI << RX_DEMAP.fsm();
  RX_TI << RX_DETUPLE.fsm();
  RX_TI << RX_DERAND.fsm();

' 2 207 

  // iopad definition
  dfix T_byte(0, 8, 0);
  RX_TI.inpad(chan_out, T(T_sample_lms));
  RX_TI.inpad(rx_diff_mode, T_bit);
  RX_TI.inpad(rx_constel_mode, T_bit);
  RX_TI.outpad(rx_byte_out, T_byte);
  RX_TI.outpad(rx_sync_out, T_bit);

  // insert clear registers state
  RX_LMSFF.fsm().clear_regs();
  RX_DEMAP.fsm().clear_regs();
  RX_DETUPLE.fsm().clear_regs();
  RX_DERAND.fsm().clear_regs();

  // testbench generator for this clocktrue model
  RX_LMSFF.fsm().tb_enable();
  RX_DEMAP.fsm().tb_enable();
  RX_DETUPLE.fsm().tb_enable();
  RX_DERAND.fsm().tb_enable();
  RX_TI.tb_enable();
  RX_TI.generate();

229 

  //------------------------------------------------------------
  //
  //interpreter
  //
  interpreter P;
  P << GEN;
  P << TX;
  P << CHAN;
  P << RX_UT;
  P << RX_TI;
  P << ck;

  P.observe(glbQPSK, "QPSK");
  P.observe(glbT1, "T1");
  P.observe(glbT2, "T2");
  P.observe(glbT3, "T3");
  P.observe(glbT20, "T20");
  P.observe(glbNoiseLevel, "NoiseLevel");
  P.observe(glbADWbits, "ADWbits");
  P.observe(glbADLbits, "ADLbits");
  P.observe(glbDiff, "DiffEnc");

  P.ATTRIBUTE(lmsff, RX_LMSFF, FWLENGTH, lmsff_fwlen);
  P.ATTRIBUTE(lmsff, RX_LMSFF, STEP_PAR, lmsff_step);
  P.ATTRIBUTE(lmsff, RX_LMSFF, P0, lmsff_p0);
  P.ATTRIBUTE(lmsff, RX_LMSFF, P1, lmsff_p1);
  P.ATTRIBUTE(lmsff, RX_LMSFF, P2, lmsff_p2);
  P.ATTRIBUTE(lmsff, RX_LMSFF, P3, lmsff_p3);
  P.ATTRIBUTE(lmsff, RX_LMSFF, INIT, lmsff_init);
  P.ATTRIBUTE(derand, RX_DERAND, SEED, derand_seed);
  P.ATTRIBUTE(derand, RX_DERAND, BYPASS, derand_bypass);

  P.go(argc, argv);

}
265 



5.4 driver/sys.h

#ifndef SYS_H
#define SYS_H

// @(#)sys.h 1.3 98/03/27

#include "Callback2wRet.h"

#define ATTRIBUTE(CLASS, INST, PARM, NAME) \
  obsAttr(make_callback((Callback2wRet<int, double, int> *) 0, &INST, CLASS::setAttr), CLASS::PARM, #NAME)

// P.obsAttr(make_callback((Callback2wRet<int, double, int> *) 0, &RX_LMSFF, lmsff::setAttr), lmsff::FWLENGTH, "lmsff_fwlen");

#define DEBUGQ(A) FBQ(A); FBQ(db_##A); A.asDup(db_##A);

#include "../tx/rnd.h"
#include "../tx/tuplelize.h"
#include "../tx/diffenc.h"
#include "../tx/map.h"
#include "../tx/shape.h"
#include "../tx/ad.h"
#include "../chan/fir.h"
#include "../chan/noise.h"
#include "../rx/lmsff.h"
#include "../rx/demap.h"
#include "../rx/detuple.h"
#include "../rx/derand.h"

#endif
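
The ATTRIBUTE macro above combines two mechanisms: the preprocessor stringizing operator (#NAME), which supplies the name under which a parameter is exposed, and a pointer to the member function CLASS::setAttr, which is called when the parameter is set. The following is a minimal sketch of that pattern only. It assumes a hypothetical Registry class standing in for the interpreter and a hypothetical Equalizer block; make_callback and Callback2wRet are not reproduced, and all names in the sketch are illustrative.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical design block with two tunable attributes.
struct Equalizer {
    enum { FWLENGTH, STEP_PAR };
    int nf = 8;
    int step = 4;

    int setAttr(int attr, double v) {
        switch (attr) {
        case FWLENGTH: nf = static_cast<int>(v); break;
        case STEP_PAR: step = static_cast<int>(v); break;
        default: return 0;
        }
        return 1;
    }
};

// Hypothetical registry: maps an external name to a bound setter call.
struct Registry {
    struct Entry {
        Equalizer* inst;
        int (Equalizer::*fn)(int, double);
        int parm;
    };
    std::map<std::string, Entry> table;

    void obsAttr(Equalizer* inst, int (Equalizer::*fn)(int, double),
                 int parm, const std::string& name) {
        table[name] = Entry{inst, fn, parm};
    }

    // Looks the name up and forwards the value to the bound setAttr.
    int set(const std::string& name, double v) {
        auto it = table.find(name);
        if (it == table.end()) return 0;
        Entry& e = it->second;
        assert(e.inst != nullptr);
        return (e.inst->*e.fn)(e.parm, v);
    }
};

// Same shape as the macro in sys.h: #NAME becomes the string key.
#define ATTRIBUTE(CLASS, INST, PARM, NAME) \
    obsAttr(&INST, &CLASS::setAttr, CLASS::PARM, #NAME)
```

With this sketch, `P.ATTRIBUTE(Equalizer, eq, FWLENGTH, lmsff_fwlen)` registers the key "lmsff_fwlen", and a later `P.set("lmsff_fwlen", 12.0)` reaches `eq.setAttr(Equalizer::FWLENGTH, 12.0)`.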



6 Receiver Code

6.1 rx/demap.h

//
// COPYRIGHT
// =========
//
// Copyright 1996 IMEC, Leuven, Belgium
//
// All rights reserved.
//
//
// Module:
//   MAP
//
// Purpose:
//   Mapping of QAM16/QPSK constellations to symbols   @(#)demap.h 1.5 98/03/30
//
// Author:
//   Patrick Schaumont / Radim Cmar
//

#ifndef DEMAP_H
#define DEMAP_H

#include "qlib.h"
#ifdef I2C
#include "i2c_master.h"
#include "i2c_slave.h"
#endif
#include "macros.h"
#include "typedefine.h"

class demap : public base {
public:

  clk &_ck;
#ifdef I2C
  i2c_slave _slave;
#endif
  PRT(symtype_in);
  PRT(diff_mode);
  PRT(i_in);
  PRT(q_in);
  PRT(symtype_out);
  PRT(symbol_out);
  ctlfsm _fsm;

public:
  enum {DEBUGMODE, REF};
  enum {QAM16, QPSK};
  int debug_mode;
  double ref;

  demap(char *name,
        clk &clk,
        _PRT(symtype_in),
        _PRT(diff_mode),
        _PRT(i_in),
        _PRT(q_in),
        _PRT(symtype_out),
        _PRT(symbol_out));

  ~demap();
  int setAttr(int Attr, double v = 0);
  int decide(dfix constel, dfix est);
  int run();
  void define();
  ctlfsm &fsm();
#ifdef I2C
  i2c_slave &slave();
#endif

};

#endif

6.2 rx/demap.cxx

//
// COPYRIGHT
// =========
//
// Copyright 1996 IMEC, Leuven, Belgium
//
// All rights reserved.
//
//
// Module:
//   MAP
//
// Purpose:
//   Mapping of QAM16/QPSK constellations to symbols   @(#)demap.cxx 1.8 98/04/07
//
// Author:
//   Radim Cmar
//


#include "demap.h"
#include "trans.h"

// QAM16
static int vIQMap16[4][4] = {
  { 15, 14, 10, 11},
  { 13, 12,  8,  9},
  {  5,  4,  0,  2},
  {  7,  6,  1,  3}};

// QPSK
static int vIQMap4[2][2] = {
  { 3, 2}, {1, 0}};

34 

demap::demap(char *name,
             clk &clk,
             _PRT(symtype_in),
             _PRT(diff_mode),
             _PRT(i_in),
             _PRT(q_in),
             _PRT(symtype_out),
             _PRT(symbol_out)
             ) : base(name),
  _ck(clk),
#ifdef I2C
  _slave(strapp(name, "_i2c_host")),
#endif
  IS_SIG(symtype_in, T_bit),
  IS_SIG(diff_mode, T_bit),
  IS_SIG(i_in, T_float),
  IS_SIG(q_in, T_float),
  IS_REG(symtype_out, _ck, T_bit),
  IS_REG(symbol_out, _ck, T_float)
{
  IS_IP(symtype_in);
  IS_IP(diff_mode);
  IS_IP(i_in);
  IS_IP(q_in);
  IS_OP(symtype_out);
  IS_OP(symbol_out);

  debug_mode = 0;
}

demap::~demap() {
}
67 

int demap::setAttr(int Attr, double v) {
  switch (Attr) {
  case REF:
    ref = v; break;
  case DEBUGMODE:
    debug_mode = (int) v; break;
  }
  return 1;
}
77 
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78// 

79 

int demap::run() {

  int thissym;
  int ik, qk;
  int n_ik, n_qk;
  static int ik_at = 1;
  static int qk_at = 1;

  if ((FBID(i_in).getSize() < 1) ||
      (FBID(q_in).getSize() < 1) ||
      (FBID(symtype_in).getSize() < 1) ||
      (FBID(diff_mode).getSize() < 1)
     )
    return 0;

  dfix vi = FBID(i_in).get();
  dfix vq = FBID(q_in).get();
  dfix constel = FBID(symtype_in).get();
  dfix diffdec = FBID(diff_mode).getIndex(0);

  int indi = decide(constel, vi);
  int indq = decide(constel, vq);

  if (constel == QAM16) {
    thissym = vIQMap16[indi][indq];
  } else {
    thissym = vIQMap4[indi][indq];
  }
  int thissym0 = thissym;


  if (diffdec == 1) {
    if (constel == QAM16) {
      ik = (thissym >> 3) & 1;
      qk = (thissym >> 2) & 1;
      n_ik = (((~(ik^qk)) & (ik^ik_at)) | ((ik^qk) & (qk^qk_at))) & 1;
      n_qk = (((~(ik^qk)) & (qk^qk_at)) | ((ik^qk) & (ik^ik_at))) & 1;
      ik_at = ik;
      qk_at = qk;
      thissym = (n_ik << 3) + (n_qk << 2) + (thissym & 3);

    } else {
      ik = (thissym >> 1) & 1;
      qk = (thissym     ) & 1;
      n_ik = (((~(ik^qk)) & (ik^ik_at)) | ((ik^qk) & (qk^qk_at))) & 1;
      n_qk = (((~(ik^qk)) & (qk^qk_at)) | ((ik^qk) & (ik^ik_at))) & 1;
      ik_at = ik;
      qk_at = qk;
      thissym = (n_ik << 1) + (n_qk);
    }
  }
131 

  if (debug_mode)
    cout << "_constel:_" << constel
         << "_i:_" << vi
         << "_q:_" << vq
         << "_qk:_" << qk
         << "_n_ik:_" << n_ik
         << "_thissym:_" << thissym << endl;


142 







  FBID(symbol_out) << (thissym);
  FBID(symtype_out) << (constel);

  return 1;
}
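
The differential decoding step inside demap::run() can be isolated from the library types. The sketch below restates the same Boolean expressions on plain integers, under the assumption that ik and qk are single bits and that the previous pair starts at (1, 1) as in the untimed model above. The struct name DiffDecoder and its step() method are hypothetical and are not part of the listing.

```cpp
#include <cassert>

// Hypothetical helper: keeps the previous (I, Q) bit pair and decodes
// the current pair with the expressions used in demap::run().
struct DiffDecoder {
    int ik_at = 1;  // previous I bit
    int qk_at = 1;  // previous Q bit

    // Returns (n_ik << 1) | n_qk for the current bit pair.
    int step(int ik, int qk) {
        assert((ik == 0 || ik == 1) && (qk == 0 || qk == 1));
        const int x = ik ^ qk;
        const int n_ik = (((~x) & (ik ^ ik_at)) | (x & (qk ^ qk_at))) & 1;
        const int n_qk = (((~x) & (qk ^ qk_at)) | (x & (ik ^ ik_at))) & 1;
        ik_at = ik;
        qk_at = qk;
        return (n_ik << 1) | n_qk;
    }
};
```

When the current bits are equal, each decoded bit is the difference with its own predecessor; when they differ, the roles of the I and Q predecessors are swapped.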

148 

int demap::decide(dfix constel, dfix est) {
  double c = ref/3;
  if (constel == QAM16) {
    if (est > dfix(2*c))
      return 3;
    else if (est > dfix(0))
      return 2;
    else if (est > dfix(-2*c))
      return 1;
    else
      return 0;
  } else {
    if (est > dfix(0.))
      return 1;
    else
      return 0;
  }
}
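
The threshold logic of demap::decide() does not depend on the dfix type. As a minimal sketch, assuming plain double values in place of dfix and passing the reference level ref as an explicit argument (both are assumptions of this example), the per-axis slicer can be written as follows.

```cpp
#include <cassert>

enum Constellation { QAM16, QPSK };

// Returns the decision index along one axis (I or Q).
// QAM16: four levels separated at -2c, 0 and +2c, with c = ref / 3.
// QPSK:  two levels separated at 0.
int decide(Constellation constel, double est, double ref) {
    assert(ref > 0.0);
    const double c = ref / 3.0;
    if (constel == QAM16) {
        if (est > 2 * c) return 3;
        if (est > 0.0) return 2;
        if (est > -2 * c) return 1;
        return 0;
    }
    return est > 0.0 ? 1 : 0;
}
```

The two indices obtained for the I and Q samples then select one entry of the vIQMap16 or vIQMap4 table.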
167 
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168// 

169 

ctlfsm &demap::fsm() {
  return _fsm;
}

#ifdef I2C
i2c_slave &demap::slave() {
  return _slave;
}
#endif
179 

void demap::define() {
  int i;

  dfix T_2bit(0, 2, 0, dfix::tc);
  dfix T_cnt(0, 3, 0, dfix::ns);    // symbol counter upto 4
  dfix T_symb(0, 4, 0, dfix::ns);   // symbol type 0..15

  PORT_TYPE(i_in, T(T_sample_demap));   // user type
  PORT_TYPE(q_in, T(T_sample_demap));   // user type
  PORT_TYPE(symbol_out, T_symb);

  FSM(_fsm);
  INITIAL(rst);
  STATE(phase1);
  STATE(phase2);
  STATE(phase3);

  SIGCK(constelqam, _ck, T_bit);
  SIGCK(diffdecod, _ck, T_bit);
  SIGCK(i_inp, _ck, T(T_sample_demap));
  SIGCK(q_inp, _ck, T(T_sample_demap));
  SIGW(indi, T_2bit);
  SIGW(indq, T_2bit);
  SIGCK(start_frame, _ck, T_bit);
  _sigarray maps16("maps", 16, &_ck, T_symb);
  _sigarray maps4("maps", 4, &_ck, T_symb);
  SIGW(symb0, T_symb);
  SIGW(symb1, T_symb);
  SIGW(ik, T_bit);
  SIGW(qk, T_bit);
  SIGW(ik_l, T_bit);
  SIGW(qk_l, T_bit);
  SIGCK(ik_at, _ck, T_bit);
  SIGCK(qk_at, _ck, T_bit);
  SIGW(ak, T_bit);
  SIGW(bk, T_bit);

#ifdef I2C
  for (i = 0; i < 16; i++)
    _slave.put(&maps16[i]);
  for (i = 0; i < 4; i++)
    _slave.put(&maps4[i]);
#endif


  SFG(demap_allways);
  GET(diff_mode);
  diffdecod = diff_mode;

  SFG(demap_reset);
  for (i = 0; i < 16; i++)
    maps16[i] = W(T_symb, vIQMap16[i/4][i%4]);
  for (i = 0; i < 4; i++)
    maps4[i] = W(T_symb, vIQMap4[i/2][i%2]);

  setv(start_frame, 0);
  setv(ik_at, 0);
  setv(qk_at, 0);
239 

240 

  SFG(demap_qam16);
  double c = ref/3;
  indi = (i_inp <= C(i_inp, -2*c)).cassign(C(indi, 0),
         (i_inp <= C(i_inp, 0.0)).cassign(C(indi, 1),
         (i_inp <= C(i_inp, +2*c)).cassign(C(indi, 2), C(indi, 3))));

  indq = (q_inp <= C(q_inp, -2*c)).cassign(C(indq, 0),
         (q_inp <= C(q_inp, 0.0)).cassign(C(indq, 1),
         (q_inp <= C(q_inp, +2*c)).cassign(C(indq, 2), C(indq, 3))));

250 


  symb0 = ((indi==W(T_2bit,0)) & (indq==W(T_2bit,0))).cassign(maps16[0],
          ((indi==W(T_2bit,0)) & (indq==W(T_2bit,1))).cassign(maps16[1],
          ((indi==W(T_2bit,0)) & (indq==W(T_2bit,2))).cassign(maps16[2],
          ((indi==W(T_2bit,0)) & (indq==W(T_2bit,3))).cassign(maps16[3],
          ((indi==W(T_2bit,1)) & (indq==W(T_2bit,0))).cassign(maps16[4],
          ((indi==W(T_2bit,1)) & (indq==W(T_2bit,1))).cassign(maps16[5],
          ((indi==W(T_2bit,1)) & (indq==W(T_2bit,2))).cassign(maps16[6],
          ((indi==W(T_2bit,1)) & (indq==W(T_2bit,3))).cassign(maps16[7],
          ((indi==W(T_2bit,2)) & (indq==W(T_2bit,0))).cassign(maps16[8],
          ((indi==W(T_2bit,2)) & (indq==W(T_2bit,1))).cassign(maps16[9],
          ((indi==W(T_2bit,2)) & (indq==W(T_2bit,2))).cassign(maps16[10],
          ((indi==W(T_2bit,2)) & (indq==W(T_2bit,3))).cassign(maps16[11],
          ((indi==W(T_2bit,3)) & (indq==W(T_2bit,0))).cassign(maps16[12],
          ((indi==W(T_2bit,3)) & (indq==W(T_2bit,1))).cassign(maps16[13],
          ((indi==W(T_2bit,3)) & (indq==W(T_2bit,2))).cassign(maps16[14],
          maps16[15]
          )))))))))))))));
  ak = ((~(ik ^ qk)) & (ik ^ ik_l)) | ((ik ^ qk) & (qk ^ qk_l));
  bk = ((~(ik ^ qk)) & (qk ^ qk_l)) | ((ik ^ qk) & (ik ^ ik_l));

  symb1 = (symb0 & W(T_symb, 3)) |
          ((cast(T_symb, ak) << W(T_symb, 3)) & W(T_symb, 8)) |
          ((cast(T_symb, bk) << W(T_symb, 2)) & W(T_symb, 4));
  symbol_out = (diffdecod).cassign(symb1, symb0);





283 
284 

  SFG(demap_qpsk);
  indi = (i_inp < C(i_inp, 0)).cassign(C(indi, 0), C(indi, 1));
  indq = (q_inp < C(q_inp, 0)).cassign(C(indq, 0), C(indq, 1));

  symb0 = ((indi==W(T_2bit,0)) & (indq==W(T_2bit,0))).cassign(maps4[0],
          ((indi==W(T_2bit,0)) & (indq==W(T_2bit,1))).cassign(maps4[1],
          ((indi==W(T_2bit,1)) & (indq==W(T_2bit,0))).cassign(maps4[2],
          maps4[3]
          )));

  ik_l = (start_frame).cassign(W(T_bit, 0), ik_at);
  qk_l = (start_frame).cassign(W(T_bit, 0), qk_at);

  ik = cast(T_bit, symb0 >> W(T_bit, 1));
  qk = cast(T_bit, symb0);
  ak = ((~(ik ^ qk)) & (ik ^ ik_l)) | ((ik ^ qk) & (qk ^ qk_l));
  bk = ((~(ik ^ qk)) & (qk ^ qk_l)) | ((ik ^ qk) & (ik ^ ik_l));
  ik_at = ik;
  qk_at = qk;

  symb1 = ((cast(T_symb, ak) << W(T_symb, 1)) & W(T_symb, 2)) |
          (cast(T_symb, bk) & W(T_symb, 1));
  symbol_out = (diffdecod).cassign(symb1, symb0);
308 

309 

  SFG(demap_in);
  GET(i_in);
  GET(q_in);
  GET(symtype_in);
  i_inp = i_in;
  q_inp = q_in;
  constelqam = ~symtype_in;
  symtype_out = symtype_in;

  SFG(demap_out);
  PUT(symbol_out);
  PUT(symtype_out);


  //

  DEFAULTDO(demap_allways);
  AT(rst) ALLWAYS
    DO(demap_reset)
    GOTO(phase1);

  AT(phase1) ALLWAYS
    DO(demap_in)
    GOTO(phase2);

  AT(phase2) ON(_cnd(constelqam))
    DO(demap_qam16)
    GOTO(phase3);

  AT(phase2) ON(!_cnd(constelqam))
    DO(demap_qpsk)
    GOTO(phase3);

  AT(phase3) ALLWAYS
    DO(demap_out)
    GOTO(phase1);

346 

347 

#ifdef I2C
  _slave.attach(_fsm, *state_phase2, _ck);
#endif

  _fsm.setinfo(verbose);
  ofstream F0("demap_trans0.dot");
  F0 << _fsm;
  F0.close();

  transform TRANSF(_fsm);
  TRANSF.fsm_handshake1(_ck);

  ofstream F("demap_trans.dot");
  F << _fsm;
  F.close();
  _fsm.setinfo(silent);

  FSMEXP(typeName());
}

367 

6.3 rx/derand.h 

//
// COPYRIGHT
// =========
//
// Copyright 1996 IMEC, Leuven, Belgium
//
// All rights reserved.
//
//
// Module:
//   PRBS
//
// Purpose:
//   De-randomises data using a 6-bit or 15-bit
//   Pseudo Random Binary Sequence.   @(#)derand.h 1.2 98/03/30
//
// Author:
//   r cmar
//
//

#include "qlib.h"
#ifdef I2C
#include "i2c_master.h"
#include "i2c_slave.h"
#endif
#include "macros.h"
#include "typedefine.h"

#ifndef DERAND_H
#define DERAND_H

class derand : public base
{

public:
  clk &_ck;
#ifdef I2C
  i2c_slave _slave;
#endif
  PRT(byte_in);
  PRT(syncro);
  PRT(byte_out);
  PRT(sync_out);
  ctlfsm _fsm;

  enum {SEED, BYPASS, DEBUGMODE};

  derand(char *name,
         clk &clk,
         _PRT(byte_in),
         _PRT(syncro),
         _PRT(byte_out),
         _PRT(sync_out)
         );

  int setAttr(int Attr, double v = 0);
  int run();
  void define();
  ctlfsm &fsm();
#ifdef I2C
  i2c_slave &slave();
#endif

public:
  int bypass;
  int seed;
  int debug;
};

#endif

6.4 rx/derand.cxx 

//
// COPYRIGHT
// =========
//
// Copyright 1996 IMEC, Leuven, Belgium
//
// All rights reserved.
//
//
// Module:
//   PRBS
//
// Purpose:
//   De-randomises data using a 6-bit or 15-bit
//   Pseudo Random Binary Sequence.   @(#)derand.cxx 1.8 98/04/07
//
// Authors:
//   r cmar
//
//

#include "derand.h"
#include "trans.h"

derand::derand(char *name,
               clk &clk,
               _PRT(byte_in),
               _PRT(syncro),
               _PRT(byte_out),
               _PRT(sync_out)
               ) : base(name),
  _ck(clk),
#ifdef I2C
  _slave(strapp(name, "_i2c_host")),
#endif
  IS_SIG(byte_in, T_8bit),
  IS_SIG(syncro, T_bit),
  IS_REG(byte_out, clk, T_8bit),
  IS_REG(sync_out, clk, T_bit)
{
  IS_IP(byte_in);
  IS_IP(syncro);
  IS_OP(byte_out);
  IS_OP(sync_out);

  bypass = 0;
  seed = 0x3f;
  debug = 0;
}
30 50 

51// 
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52 

int derand::setAttr(int Attr, double v)
{
  switch (Attr)
  {
  case SEED:
    seed = (int)v; break;
  case BYPASS:
    bypass = (int)v; break;
  case DEBUGMODE:
    debug = (int)v; break;
  }
  return 1;
}
15 66 

67// 

68 

int derand::run()
{
  static unsigned shiftreg = 0;

#define BiT(k, n)  ((k >> (n-1)) & 1)
#define MaSK(k, n) (k & ((1 << (n+1))-1))

  if ((FBID(byte_in).getSize() < 1) || (FBID(syncro).getSize() < 1))
    return 0;

  dfix data_in = FBID(byte_in).get();
  dfix sync = FBID(syncro).get();

  unsigned data = unsigned(data_in.Val());

  if (bypass == 0) {

    if (sync == dfix(1))
      shiftreg = seed;

    unsigned mask = 0;
    int xbit;
    for (int k = 0; k < 8; k++) {
      xbit = BiT(shiftreg, 5) ^ BiT(shiftreg, 6);
      shiftreg = MaSK(xbit | (shiftreg << 1), 6);
      mask = (mask << 1) | xbit;
    }

    data ^= mask;
  }

  FBID(byte_out) << dfix((double)(data));
  return 1;
}
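
The shift-register update in derand::run() can be exercised without the queue and dfix types. The sketch below is a standalone restatement under stated assumptions: the struct name Derandomizer and its methods are hypothetical, bytes are passed directly instead of through FBID queues, and the bypass attribute is left out. The bit() and mask() helpers mirror the BiT and MaSK macros of the listing, including the fact that MaSK(k, 6) keeps bits 0 to 6.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical standalone version of the PRBS de-randomiser.
struct Derandomizer {
    unsigned seed = 0x3f;   // same default as the derand constructor
    unsigned shiftreg = 0;

    // Bit n of k, counted from 1 (as in the BiT macro).
    static unsigned bit(unsigned k, int n) {
        assert(n >= 1);
        return (k >> (n - 1)) & 1u;
    }
    // Keeps bits 0..n of k (as in the MaSK macro).
    static unsigned mask(unsigned k, int n) {
        return k & ((1u << (n + 1)) - 1u);
    }

    // Advances the register eight times and collects one mask byte.
    unsigned next_mask() {
        unsigned m = 0;
        for (int k = 0; k < 8; ++k) {
            const unsigned x = bit(shiftreg, 5) ^ bit(shiftreg, 6);
            shiftreg = mask(x | (shiftreg << 1), 6);
            m = (m << 1) | x;
        }
        return m;
    }

    // Reloads the seed on sync, then XORs the byte with the mask.
    std::uint8_t process(std::uint8_t data, bool sync) {
        if (sync) shiftreg = seed;
        return static_cast<std::uint8_t>(data ^ next_mask());
    }
};
```

Because the mask is applied with XOR, passing a byte stream through two instances that are synchronised on the same byte returns the original data, which is a convenient self-check for a scrambler and de-scrambler pair.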

103 

104// 



105 

ctlfsm &derand::fsm() {
  return _fsm;
}

#ifdef I2C
i2c_slave &derand::slave() {
  return _slave;
}
#endif
115 

void derand::define() {

  dfix T_byte(0, 8, 0, dfix::ns);
  dfix T_sreg(0, 16, 0, dfix::ns);
  dfix T_num(0, 4, 0, dfix::ns);    // to express constants 0..15

  PORT_TYPE(byte_in, T_byte);       // 8 bits
  PORT_TYPE(byte_out, T_byte);      // 8 bits

  SIGW(mask, T_byte);               // 8 bits
  SIGCK(shiftreg, _ck, T_sreg);     // 16 bits
  SIGCK(seed, _ck, T_sreg);         // 16 bits
  SIGCK(bypass, _ck, T_bit);
  _sigarray xbits("xbits", 8+1, T_bit);
  _sigarray shifts("shifts", 8+1, T_sreg);
  _sigarray masks("masks", 8+1, T_byte);

#ifdef I2C
  _slave.put(&seed);
  _slave.put(&bypass);
#endif

  FSM(_fsm);
  INITIAL(rst);
  STATE(phase1);
  STATE(phase2);

  SFG(rnd_reset);
  byte_out = W(T_byte, 0);
  seed = W(T_sreg, 0x3f);
  sync_out = W(T_bit, 0);
  bypass = W(T_bit, 0);
  shiftreg = W(T_sreg, 0);


  SFG(rnd_read);
  GET(byte_in);
  GET(syncro);


#define BIT(s,k)  cast(T_bit, s >> W(T_num, k-1))
#define MASK(s,n) (s & W(T_sreg, (1 << (n+1))-1))
15 158 

  SFG(rnd_prbs6);

  shifts[0] = (syncro == W(T_bit, 1)).cassign(seed, shiftreg);

  masks[0] = W(T_byte, 0);
  for (int k = 0; k < 8; k++) {
    xbits[k] = BIT(shifts[k], 5) ^ BIT(shifts[k], 6);
    shifts[k+1] = MASK((cast(T_sreg, xbits[k]) & W(T_sreg, 1)) | (shifts[k] << W(T_num, 1)), 6);
    masks[k+1] = (masks[k] << W(T_byte, 1)) | (cast(T_byte, xbits[k]) & W(T_byte, 1));
  }
  shiftreg = shifts[8];
  mask = masks[8];

  byte_out = (bypass).cassign(byte_in, byte_in ^ mask);
  sync_out = W(T_bit, 1);


  SFG(rnd_write);
  PUT(byte_out);
  PUT(sync_out);
  sync_out = W(T_bit, 0);
180 

181 

  //

  AT(rst) ALLWAYS
    DO(rnd_reset)
    GOTO(phase1);

  AT(phase1) ALLWAYS    // state << cond << sfg << sfg << state
    DO(rnd_read)        // phase1 << allways << rnd_read << rnd_prb6 << phase2
    DO(rnd_prbs6)
    GOTO(phase2);

  AT(phase2) ALLWAYS
    DO(rnd_write)
    GOTO(phase1);

#ifdef I2C
  _slave.attach(_fsm, *state_phase2, _ck);
#endif

  _fsm.setinfo(verbose);
  ofstream F0("derand_trans0.dot");
  F0 << _fsm;
  F0.close();

  transform TRANSF(_fsm);
  TRANSF.fsm_handshake1(_ck);

  ofstream F("derand_trans.dot");
  F << _fsm;
  F.close();
  _fsm.setinfo(silent);

  FSMEXP(typeName());
}
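
The AT ... DO ... GOTO statements of derand::define() describe a three-state controller in which every transition is unconditional. As a rough sketch of that schedule only, assuming a hypothetical Schedule class that records the names of the signal flow graphs instead of executing them, the control flow can be modelled as follows. The FSM, SFG and ctlfsm facilities of the library are not reproduced.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class State { rst, phase1, phase2 };

// Hypothetical model of the derand controller: one tick() per clock
// cycle, recording which signal flow graphs would be executed.
struct Schedule {
    State state = State::rst;
    std::vector<std::string> trace;

    void tick() {
        switch (state) {
        case State::rst:
            trace.push_back("rnd_reset");
            state = State::phase1;
            break;
        case State::phase1:
            trace.push_back("rnd_read");
            trace.push_back("rnd_prbs6");
            state = State::phase2;
            break;
        case State::phase2:
            trace.push_back("rnd_write");
            state = State::phase1;
            break;
        }
    }

    void run(int cycles) {
        assert(cycles >= 0);
        for (int i = 0; i < cycles; ++i) tick();
    }
};
```

After the reset cycle the model alternates between phase1 (read and de-randomise) and phase2 (write), so one byte is produced every two cycles.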
15 216 

6.5 rx/detuple.h 

//
// COPYRIGHT
// =========
//
// Copyright 1996 IMEC, Leuven, Belgium
//
// All rights reserved.
//
//
// Module:
//   TUPLE
//
// Purpose:
//   header detection + detuplelization   @(#)detuple.h 1.2 98/03/30
//
// Author:
//   Radim Cmar
//

#ifndef DETUPLE_H
#define DETUPLE_H

#include "qlib.h"
#include "macros.h"
#include "typedefine.h"

class detuple : public base {
public:

  clk &_ck;
  PRT(symbol);
  PRT(symtype);
  PRT(byte);
  PRT(syncro);
  ctlfsm _fsm;

  int debug_mode;

public:
  enum {DEBUGMODE};
  enum {QAM16, QPSK};

  detuple(char *name,
          clk &clk,
          _PRT(symbol),
          _PRT(symtype),
          _PRT(byte),
          _PRT(syncro)
          );

  ~detuple();
  int setAttr(int Attr, double v = 0);
  int run();
  void define();
  ctlfsm &fsm();
};

#endif



6.6 rx/detuple.cxx

//
// COPYRIGHT
// =========
//
// Copyright 1996 IMEC, Leuven, Belgium
//
// All rights reserved.
//
//
// Module:
//   TUPLE
//
// Purpose:
//   header detection + detuplelization   @(#)detuple.cxx 1.3 98/04/07
//
// Author:
//   Radim Cmar
//


#include "detuple.h"
#include "trans.h"

detuple::detuple(char *name, clk &clk,
                 _PRT(symbol),
                 _PRT(symtype),
                 _PRT(byte),
                 _PRT(syncro)
                 ) : base(name),
  _ck(clk),
  IS_SIG(symbol, T_4bit),
  IS_SIG(symtype, T_bit),
  IS_REG(byte, _ck, T_8bit),
  IS_REG(syncro, _ck, T_bit)
{
  IS_IP(symbol);
  IS_IP(symtype);
  IS_OP(byte);
  IS_OP(syncro);

  debug_mode = 0;
}


detuple::~detuple() {
}
47 
48 

int detuple::setAttr(int Attr, double v) {
  switch (Attr) {
  case DEBUGMODE:
    debug_mode = (int)v; break;
  }
  return 1;
}


static int QAM16_sync[] = {0,0,5,5,0,0,5,5};
static int QPSK_sync[] = {0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1};
static int QAM16_headlen = 8;
static int QPSK_headlen = 16;
62 
63 

int detuple::run() {
  int i;

  static int tuplcnt = 0;
  static int corrcnt = 0;
  static int sync = 0;
  static dfix oldstype = 0;
  static dfix corrarr[16];
  static dfix tuplarr[4];

  int headlen;
  int symbcount;
  dfix tuple;

  if ((FBID(symbol).getSize() < 1) || (FBID(symtype).getSize() < 1))
    return 0;

  dfix symb = FBID(symbol).get();
  dfix stype = FBID(symtype).get();

  if (stype == QAM16) {    // length of header depends on QAM16/QPSK constel
    headlen = QAM16_headlen;
    symbcount = 2;
  }
  else {
    headlen = QPSK_headlen;
    symbcount = 4;
  }

  if (corrcnt == headlen) {

    int equal = 1;                        // search for header
    for (i = 0; i < headlen; i++) {
      if (stype == QAM16)
        equal = equal & (corrarr[i] == QAM16_sync[headlen-1-i]);
      else
        equal = equal & (corrarr[i] == QPSK_sync[headlen-1-i]);
    }

    if (equal) {                          // header appeared

      if (stype == QAM16)                 // flush tuplarr (even if not complete)
        tuple = tuplarr[0] + tuplarr[1]*16;
      else
        tuple = tuplarr[0] + tuplarr[1]*4 + tuplarr[2]*16 + tuplarr[3]*64;
      FBID(byte) << (tuple);
      FBID(syncro) << (sync);

      sync = 1;                           // indicates start of frame
      corrcnt = 1;
      tuplcnt = 0;
    }
    else {                                // normal processing

      if (tuplcnt == symbcount) {
        if (stype == QAM16)
          tuple = tuplarr[0] + tuplarr[1]*16;
        else
          tuple = tuplarr[0] + tuplarr[1]*4 + tuplarr[2]*16 + tuplarr[3]*64;
        FBID(byte) << (tuple);
        FBID(syncro) << (sync);

        sync = 0;
        tuplcnt = 1;
      }
      else
        tuplcnt++;
    }
  }
  else
    corrcnt++;

  for (i = symbcount-1; i > 0; i--)
    tuplarr[i] = tuplarr[i-1];
  tuplarr[0] = corrarr[headlen-1];        // shift out the oldest symbol

  for (i = headlen-1; i > 0; i--)         // shift in new symbol
    corrarr[i] = corrarr[i-1];
  corrarr[0] = symb;

  if (oldstype != stype) {                // QPSK/QAM16 change
    corrcnt = 0;
    tuplcnt = 0;
  }
  oldstype = stype;

  return 1;
}
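
The byte assembly in detuple::run() packs either two 4-bit QAM16 symbols or four 2-bit QPSK symbols into one byte, with tuplarr[0] in the least significant position. A minimal sketch of that packing is given below, assuming plain unsigned values in place of dfix and a hypothetical free function pack_tuple that is not part of the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum Constellation { QAM16, QPSK };

// Packs the symbols of one tuple into a byte.
// symbols[0] supplies the least significant bits, as tuplarr[0] does.
unsigned pack_tuple(Constellation c, const std::vector<unsigned>& symbols) {
    const unsigned bits = (c == QAM16) ? 4u : 2u;        // bits per symbol
    const std::size_t count = (c == QAM16) ? 2u : 4u;    // symbols per byte
    assert(symbols.size() == count);
    unsigned tuple = 0;
    for (std::size_t i = 0; i < count; ++i) {
        assert(symbols[i] < (1u << bits));
        tuple |= symbols[i] << (bits * i);
    }
    return tuple;
}
```

For in-range symbols the shifts and ORs give the same value as the weighted sums in the listing (factors 1 and 16 for QAM16; 1, 4, 16 and 64 for QPSK).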

152 

153 

//

ctlfsm &detuple::fsm() {
  return _fsm;
}

void detuple::define() {
  int i;

  int headlen_qam = 8;
  int headlen_qpsk = 16;
  int symbcount_qam = 2;
  int symbcount_qpsk = 4;
#define max(a,b) ((a > b) ? a : b)

  dfix T_cnt(0, 5, 0, dfix::ns);      // symbol counter upto 32
  dfix T_symb(0, 4, 0, dfix::ns);     // symbol type 0..15
  dfix T_tuple(0, 8, 0, dfix::ns);

  FSM(_fsm);
  INITIAL(rst);
  STATE(phase1);
  STATE(phase2);
  STATE(phase3);
  STATE(phase4);

  SIGCK(qamtype, _ck, T_bit);
  SIGCK(old_qamtype, _ck, T_bit);
  SIGCK(symbol_reg, _ck, T_symb);

  SIGCK(iniphase, _ck, T_bit);
  SIGCK(correlated, _ck, T_bit);
  SIGCK(tuple_ready, _ck, T_bit);


  SIGCK(corrcnt, _ck, T_cnt);
  SIGCK(tuplcnt, _ck, T_cnt);

  SIGCK(byte, _ck, T_tuple);
  SIGW(tuple_qam, T_tuple);
  SIGW(tuple_qpsk, T_tuple);

  _sigarray tuplarr("tarr", max(symbcount_qam, symbcount_qpsk), &_ck, T_symb);
  _sigarray corrarr("carr", max(headlen_qam, headlen_qpsk), &_ck, T_symb);
  _sigarray ref("ref", max(headlen_qam, headlen_qpsk), T_symb);
  _sigarray equal("equal", max(headlen_qam, headlen_qpsk), T_bit);

  //

  SFG(tupler_reset);
  setv(corrcnt, 0);
  setv(tuplcnt, 0);
  setv(old_qamtype, 1);
  setv(syncro, 0);

  SFG(tupler_read);
  GET(symbol);
  GET(symtype);
  symbol_reg = symbol;
  qamtype = ~symtype;
213 

214 

  SFG(tupler_test);
  iniphase = ((qamtype) & (corrcnt != W(T_cnt, headlen_qam)))
           | ((~qamtype) & (corrcnt != W(T_cnt, headlen_qpsk)));

  tuple_ready = (qamtype).cassign(tuplcnt == W(T_cnt, symbcount_qam),
                                  tuplcnt == W(T_cnt, symbcount_qpsk));
221 
222 

  SFG(tupler_corr);
  for (i = 0; i < max(headlen_qam, headlen_qpsk); i++) {
    int iqam = (headlen_qam-1-i < 0) ? 0 : headlen_qam-1-i;
    int iqpsk = headlen_qpsk-1-i;
    ref[i] = (qamtype).cassign(W(T_symb, QAM16_sync[iqam]),
                               W(T_symb, QPSK_sync[iqpsk]));
    if (i == 0)
      equal[i] = (corrarr[i] == ref[i]);
    else
      equal[i] = equal[i-1] & (corrarr[i] == ref[i]);
  }
  correlated = (qamtype).cassign(equal[headlen_qam-1], equal[headlen_qpsk-1]);

235 
236 
237 

  SFG(tupler_compose);
  tuple_qam = (cast(T_tuple, tuplarr[0]) & W(T_tuple, 15))
            | ((cast(T_tuple, tuplarr[1]) & W(T_tuple, 15)) << W(T_cnt, 4));

  tuple_qpsk = (cast(T_tuple, tuplarr[0]) & W(T_tuple, 3))
             | ((cast(T_tuple, tuplarr[1]) & W(T_tuple, 3)) << W(T_cnt, 2))
             | ((cast(T_tuple, tuplarr[2]) & W(T_tuple, 3)) << W(T_cnt, 4))
             | ((cast(T_tuple, tuplarr[3]) & W(T_tuple, 3)) << W(T_cnt, 6));

  byte = (qamtype).cassign(tuple_qam, tuple_qpsk);

  tuplcnt = (correlated).cassign(W(T_cnt, 0-1),
            (tuple_ready).cassign(W(T_cnt, 1-1),
            tuplcnt));

  corrcnt = (correlated).cassign(W(T_cnt, 1-1),
            corrcnt);

255 
256 

  SFG(tupler_out);
  PUT(byte);
  PUT(syncro);
  syncro = correlated;


  SFG(tupler_shiftin);
  for (i = 1; i < max(symbcount_qam, symbcount_qpsk); i++)
    tuplarr[i] = tuplarr[i-1];
  tuplarr[0] = (qamtype).cassign(corrarr[headlen_qam-1], corrarr[headlen_qpsk-1]);

  for (i = max(headlen_qam, headlen_qpsk)-1; i > 0; i--)
    corrarr[i] = corrarr[i-1];
  corrarr[0] = symbol_reg;
271 

272 
15 273 

  SFG(tupler_finish_qam);
  corrcnt = (old_qamtype != qamtype).cassign(W(T_cnt, 0),
            (corrcnt == W(T_cnt, headlen_qam)).cassign(corrcnt,
            corrcnt + W(T_cnt, 1)));
  tuplcnt = (old_qamtype != qamtype).cassign(W(T_cnt, 0),
            (correlated).cassign(W(T_cnt, 0),
            (corrcnt != W(T_cnt, headlen_qam)).cassign(tuplcnt,
            (tuplcnt == W(T_cnt, symbcount_qam)).cassign(W(T_cnt, 1),
            tuplcnt + W(T_cnt, 1)))));
  old_qamtype = qamtype;

  SFG(tupler_finish_qpsk);
  corrcnt = (old_qamtype != qamtype).cassign(W(T_cnt, 0),
            (corrcnt == W(T_cnt, headlen_qpsk)).cassign(corrcnt,
            corrcnt + W(T_cnt, 1)));
  tuplcnt = (old_qamtype != qamtype).cassign(W(T_cnt, 0),
            (correlated).cassign(W(T_cnt, 0),
            (corrcnt != W(T_cnt, headlen_qpsk)).cassign(tuplcnt,
            (tuplcnt == W(T_cnt, symbcount_qpsk)).cassign(W(T_cnt, 1),
            tuplcnt + W(T_cnt, 1)))));
  old_qamtype = qamtype;
295 

  //

  AT(rst) ALLWAYS
    DO(tupler_reset)
    GOTO(phase1);

  AT(phase1) ALLWAYS
    DO(tupler_read)
    DO(tupler_test)
    DO(tupler_corr)
    GOTO(phase2);

  AT(phase2) ON(_cnd(iniphase) || (!_cnd(correlated) && !_cnd(tuple_ready)))
    GOTO(phase4);

  AT(phase2) ON(!_cnd(iniphase) && _cnd(correlated))
    DO(tupler_compose)
    GOTO(phase3);

  AT(phase2) ON(!_cnd(iniphase) && _cnd(tuple_ready) && !_cnd(correlated))
    DO(tupler_compose)
    GOTO(phase3);

  AT(phase3) ALLWAYS
    DO(tupler_out)
    GOTO(phase4);

  AT(phase4) ON(_cnd(qamtype))
    DO(tupler_shiftin)
    DO(tupler_finish_qam)
    GOTO(phase1);

  AT(phase4) ON(!_cnd(qamtype))
    DO(tupler_shiftin)
    DO(tupler_finish_qpsk)
    GOTO(phase1);

  _fsm.setinfo(verbose);
  ofstream F0("detuple_trans0.dot");
  F0 << _fsm;
  F0.close();

  transform TRANSF(_fsm);
  TRANSF.fsm_handshake1(_ck);

  ofstream F("detuple_trans.dot");
  F << _fsm;
  F.close();
  _fsm.setinfo(silent);

  FSMEXP(typeName());

}
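
Header detection in the detuple block compares a window of the most recent symbols, newest first, against the synchronisation word read backwards (corrarr[i] against QAM16_sync[headlen-1-i]). The sketch below illustrates that comparison in isolation. It is a simplification under stated assumptions: the names HeaderDetector and header_seen are hypothetical, integers replace dfix, and the window is tested right after each new symbol is pushed, whereas the listing tests the window before shifting in the current symbol.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// True when the window (newest symbol at index 0) equals the sync word
// read from its last element to its first.
bool header_seen(const std::deque<int>& window, const std::vector<int>& sync) {
    if (window.size() != sync.size()) return false;
    const std::size_t n = sync.size();
    for (std::size_t i = 0; i < n; ++i)
        if (window[i] != sync[n - 1 - i]) return false;
    return true;
}

// Hypothetical sliding-window detector built on header_seen().
struct HeaderDetector {
    std::vector<int> sync;
    std::deque<int> window;

    explicit HeaderDetector(std::vector<int> s) : sync(std::move(s)) {
        assert(!sync.empty());
    }

    // Shifts in one symbol and reports whether the header is now present.
    bool push(int symbol) {
        window.push_front(symbol);
        if (window.size() > sync.size()) window.pop_back();
        return header_seen(window, sync);
    }
};
```

Feeding the QAM16 word 0,0,5,5,0,0,5,5 symbol by symbol makes push() return true on the last symbol of the word and false before it.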
5 349 

6.7 rx/lmsff.h 


// Author : Radim Cmar
// Purpose: ADAPTIVE EQUALIZER (LMS)   @(#)lmsff.h 1.4 98/03/30

#ifndef LMS_H
#define LMS_H

#include "qlib.h"
#ifdef I2C
#include "i2c_master.h"
#include "i2c_slave.h"
#endif
#include "macros.h"
#include "typedefine.h"

class lmsff : public base {

public:
  clk &_ck;
#ifdef I2C
  i2c_slave _slave;
#endif
  PRT(constel_mode);
  PRT(in_sample);
  PRT(out_i);
  PRT(out_q);
  PRT(symtype);
  ctlfsm _fsm;

  int constel_type;   // QAM16 or QPSK
  int SPS;            // samples per symbol
  int CPS;            // cycles per sample
  int NF;             // forward taps
  int STEP;           // step adaptation constant
  double p0, p1, p2, p3;
  double ref;

public:
  enum { SPS_PAR, FWLENGTH, STEP_PAR, INIT, P0, P1, P2, P3, REF };
  enum { QAM16, QPSK };

  lmsff(char *name,
        clk &clk,
        _PRT(constel_mode),
        _PRT(in_sample),
        _PRT(out_i),
        _PRT(out_q),
        _PRT(symtype)
        );

  int setAttr(int Attr, double v = 0);
  int run();
  void define();
  ctlfsm &fsm();
#ifdef I2C
  i2c_slave &slave();
#endif

  // untimed mode
  dfix decide(dfix constel, dfix est);
  dfix coefi[111];
  dfix coefq[111];
  dfix sample[111];

};

#endif
.15 6.8 rx/lmsff .cxx 

// Author : Radim Cmar
// Purpose: ADAPTIVE EQUALIZER (LMS) @(#)lmsff.cxx 1.18 98/04/07

#include "lmsff.h"
#include <math.h>
#include "trans.h"

lmsff::lmsff(char *name,
             clk & clk,
             _PRT(constel_mode),
             _PRT(in_sample),
             _PRT(out_i),
             _PRT(out_q),
             _PRT(symtype)
            ) : base(name),
  _ck(clk),
#ifdef I2C
  _slave(strapp(name, "_i2c_host")),
#endif
  IS_SIG(constel_mode, T_bit),
  IS_SIG(in_sample, T_float),
  IS_REG(out_i, _ck, T_float),
  IS_REG(out_q, _ck, T_float),
  IS_REG(symtype, _ck, T_bit)
{
  IS_IP(constel_mode);
  IS_IP(in_sample);
  IS_OP(out_i);
  IS_OP(out_q);
  IS_OP(symtype);

  SPS = 4;
  STEP = 4;
  NF = 8;
  ref = 3.0;
}

int lmsff::setAttr(int Attr, double v) {
  switch (Attr) {
  case SPS_PAR:   // parametrizable only for untimed model
    SPS = (int) v;
    break;
  case FWLENGTH:
    NF = (int) v;
    break;
  case STEP_PAR:
    STEP = (int) v;
    break;
  case P0:
    p0 = v;
    break;
  case P1:
    p1 = v;
    break;
  case P2:
    p2 = v;
    break;
  case P3:
    p3 = v;
    break;
  case REF:
    ref = v;
    break;
  case INIT:
    cerr << "INFO: [...] equalizer [...]";
    for (int i = 0; i < NF; i++) {
      sample[i] = dfix(0);
      coefi[i] = dfix(0);
      coefq[i] = dfix(0);
    }
    int offs = (NF-4)/2;
    coefq[offs+0] = p0;
    coefi[offs+1] = p1;
    coefq[offs+2] = p2;
    coefi[offs+3] = p3;
    break;
  }
  return 1;
}

//--------------------------------------------------------------

int lmsff::run() {
  int i;
  dfix acci, accq, equali, equalq, esti, estq, erri, errq;

  if ((FBID(in_sample).getSize() < SPS) || (FBID(constel_mode).getSize() < 1))
    return 0;

  dfix constel = FBID(constel_mode).getIndex(0);
  dfix step = 1.0/pow(2.0, STEP);

  // ff filtering
  acci = 0;
  accq = 0;
  for (i = 0; i < NF; i++) {
    acci = acci + sample[i] * coefi[i];
    accq = accq + sample[i] * coefq[i];
  }
  equali = acci;
  equalq = accq;

  // output
  FBID(out_i) << (equali);
  FBID(out_q) << (equalq);
  FBID(symtype) << (constel);

  // slicing
  esti = decide(constel, equali);
  estq = decide(constel, equalq);

  // error evaluation
  erri = esti - equali;
  errq = estq - equalq;

  // coefficient adaptation
  for (i = 0; i < NF; i++) {
    coefi[i] = coefi[i] + step * erri * sample[i];
    coefq[i] = coefq[i] + step * errq * sample[i];
  }

  // reading in samples
  for (i = NF-1; i >= SPS; i--)
    sample[i] = sample[i-SPS];
  for (i = SPS-1; i >= 0; i--)
    sample[i] = FBID(in_sample).get();

  return 1;
}
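The untimed run() routine performs, per symbol, a feed-forward filtering pass, a slicing decision, an error evaluation, a coefficient update and a shift of SPS new samples into the delay line. The following is a minimal sketch of that sequence for one rail, using plain double in place of dfix so that no quantization is modelled. The names LmsRail, lms_step and shift_in are illustrative assumptions and do not appear in the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the listing's sample[] and coefi[]/coefq[] arrays.
struct LmsRail {
    std::vector<double> sample;  // delay line, newest sample at index 0
    std::vector<double> coef;    // forward taps
};

// One iteration for a single rail: filter, slice, adapt.
// 'decide' is any slicer; 'step' plays the role of 1/2^STEP.
// Returns the equalized output computed before the coefficient update.
template <typename Slicer>
double lms_step(LmsRail& r, double step, Slicer decide) {
    assert(r.sample.size() == r.coef.size());
    double acc = 0.0;
    for (std::size_t i = 0; i < r.coef.size(); ++i)
        acc += r.sample[i] * r.coef[i];          // ff filtering
    const double err = decide(acc) - acc;        // error evaluation
    for (std::size_t i = 0; i < r.coef.size(); ++i)
        r.coef[i] += step * err * r.sample[i];   // coefficient adaptation
    return acc;
}

// Shifts fresh.size() new samples into the delay line (SPS in the listing).
// The first element of 'fresh' is the first sample read, so it ends up at
// the highest of the newly filled positions, as in run().
inline void shift_in(LmsRail& r, const std::vector<double>& fresh) {
    const std::size_t n = r.sample.size(), s = fresh.size();
    assert(s <= n);
    for (std::size_t i = n; i-- > s;)
        r.sample[i] = r.sample[i - s];
    for (std::size_t i = s; i-- > 0;)
        r.sample[i] = fresh[s - 1 - i];
}
```

In the listing the same step size and the same delay line serve both the I and the Q rail; only the coefficient sets and the error terms differ.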

dfix lmsff::decide(dfix constel, dfix est) {
  double c = ref/3;
  if (constel == QAM16) {
    if (est > dfix(2*c))
      return dfix(3*c);
    else if (est > dfix(0))
      return dfix(1*c);
    else if (est > dfix(-2*c))
      return dfix(-1*c);
    else
      return dfix(-3*c);
  }
  else {
    if (est > dfix(0.))
      return dfix(3*c);
    else
      return dfix(-3*c);
  }
}
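The decide() routine is a nearest-level slicer: for QAM16 each rail takes one of four levels (-3c, -c, +c, +3c) with thresholds at -2c, 0 and +2c, and for QPSK one of two levels (-3c, +3c) with a threshold at 0, where c = ref/3. A self-contained sketch on plain double follows; the function name slice and the Constellation enum are assumptions made for illustration, and the default of 3.0 for ref is taken from the constructor in the listing.

```cpp
#include <cassert>

enum Constellation { QAM16, QPSK };  // same order as the enum in lmsff.h

// Nearest-level decision for one rail. 'ref' is the outermost level;
// c = ref/3 is the unit spacing between decision thresholds.
inline double slice(Constellation constel, double est, double ref = 3.0) {
    assert(ref > 0.0);
    const double c = ref / 3.0;
    if (constel == QAM16) {
        if (est > 2 * c)  return 3 * c;
        if (est > 0.0)    return 1 * c;
        if (est > -2 * c) return -1 * c;
        return -3 * c;
    }
    return est > 0.0 ? 3 * c : -3 * c;
}
```

An estimate lying exactly on a threshold falls to the lower level, because every comparison is a strict "greater than", as in the listing.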

//--------------------------------------------------------------

ctlfsm & lmsff::fsm() {
  return _fsm;
}

#ifdef I2C
i2c_slave &lmsff::slave() {
  return _slave;
}
#endif


#define CC(a) cast(accu_type, a)
void adder_tree(_sigarray & ops, int l, int h, _sig & res) {
  if (h-l+1 > 5) {
    cerr << "lmsff_error:_maximum_5_operands_suported\n";
    exit(1);
  }
  dfix & accu_type = res.Rep()->getVal();
  switch (h-l+1) {
  case 0: res = C(res, 0); break;
  case 1: res = CC(ops[l]); break;
  case 2: res = CC(ops[l] + ops[l+1]); break;
  case 3: res = CC(ops[l] + ops[l+1]) + CC(ops[l+2]); break;
  case 4: res = CC(ops[l] + ops[l+1]) + CC(ops[l+2] + ops[l+3]); break;
  case 5: res = CC(ops[l] + ops[l+1]) + CC(CC(ops[l+2] + ops[l+3]) + CC(ops[l+4])); break;
  }
}
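The adder_tree() routine sums the operands ops[l] to ops[h] with a fixed pairing per operand count, so that at most five operands are combined in one clock cycle. The sketch below reproduces that pairing on plain long integers. It omits the cast to the accumulator type, throws instead of calling exit(), and uses the illustrative name adder_tree_sum, none of which come from the listing.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Sums ops[l..h] (inclusive) using the same pairing as the switch
// statement in the listing. An empty range (h == l-1) yields 0.
inline long adder_tree_sum(const std::vector<long>& ops, int l, int h) {
    const int n = h - l + 1;
    if (n > 5) throw std::invalid_argument("maximum 5 operands supported");
    assert(n >= 0 && (n == 0 || (l >= 0 && h < static_cast<int>(ops.size()))));
    switch (n) {
    case 0: return 0;
    case 1: return ops[l];
    case 2: return ops[l] + ops[l + 1];
    case 3: return (ops[l] + ops[l + 1]) + ops[l + 2];
    case 4: return (ops[l] + ops[l + 1]) + (ops[l + 2] + ops[l + 3]);
    default: return (ops[l] + ops[l + 1]) + ((ops[l + 2] + ops[l + 3]) + ops[l + 4]);
    }
}
```

With five operands the longest path passes through three adders, which is the deepest case the pairing allows.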

void balance_coefs2(int numcoefs, int numcycles, int* l, int* h) {
  int i, j, k;

  int orig_numcycles = numcycles;
  if (numcoefs < numcycles)
    numcycles = numcoefs;

  int paral = numcoefs/numcycles;
  int incs = numcoefs - (numcoefs/numcycles)*numcycles;

  for (k = 1; k <= numcycles; k++)
    l[k] = (k-1)*paral;

  for (i = 1; i <= incs; i++)
    for (j = i+1; j <= numcycles; j++)
      l[j]++;

  for (k = 1; k <= numcycles-1; k++)
    h[k] = l[k+1]-1;
  h[numcycles] = numcoefs-1;

  for (k = numcycles+1; k <= orig_numcycles; k++) {
    l[k] = 0;
    h[k] = -1;
  }

  if (1) {
    cout << "lmsff_info:_filter_balancing\n";
    for (k = 1; k <= orig_numcycles; k++)
      cout << l[k] << ":" << h[k] << "_";
    cout << endl;
  }
}
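The balance_coefs2() routine splits numcoefs filter taps into contiguous index ranges, one per cycle, giving the first (numcoefs mod numcycles) cycles one extra tap and marking surplus cycles with the empty range 0:-1. A compact sketch of the same partition is given below. It returns 0-based cycle entries rather than filling the 1-based l[] and h[] arrays, and the name balance is an assumption.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns one (low, high) tap index pair per cycle.
// Cycles left without work receive the empty range (0, -1).
inline std::vector<std::pair<int, int>> balance(int numcoefs, int numcycles) {
    assert(numcoefs > 0 && numcycles > 0);
    const int used = numcoefs < numcycles ? numcoefs : numcycles;
    const int base = numcoefs / used;
    const int extra = numcoefs % used;
    std::vector<std::pair<int, int>> out(numcycles, std::make_pair(0, -1));
    int low = 0;
    for (int k = 0; k < used; ++k) {
        const int len = base + (k < extra ? 1 : 0);  // first 'extra' cycles take one more tap
        out[k] = std::make_pair(low, low + len - 1);
        low += len;
    }
    return out;
}
```

For 8 taps and 6 cycles this yields the ranges 0:1, 2:3, 4:4, 5:5, 6:6 and 7:7, which are the values hard-coded for l_fil and h_fil in define().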

void lmsff::define() {

  if (NF < 6) {
    cerr << "lmsff_error:_minimum_6_coefs_required\n";
    exit(1);
  }

  int i, k, p;

  // SPS .... samples per symbol parameter
  // CPS .... cycles per sample (every CPS-phase read sample)
  // NCYC ... cycle budget in the loop
  // F_max_delay ... extra delay line positions due to read_sample within filtering
  SPS = 4;
  CPS = 2;
  int F_max_delay = 7;
  int NCYC = SPS*CPS;

  //== distribute filtering operation slices into NCYC-2 cycles =

  int l_fil[100];
  int h_fil[100];
  int l_upd[100];
  int h_upd[100];

  // budget is fixed: 8-2 = 6 cycles
  // let's have 8 coefs
  // can be more elaborate (e.g. interleaved slicing)
  int start_fil = 1; // for filtering to know to store first time
  int end_fil = 6;   // for filtering to know to store to I_equal

  l_fil[1]=0; l_fil[2]=2; l_fil[3]=4; l_fil[4]=5; l_fil[5]=6; l_fil[6]=7;
  h_fil[1]=1; h_fil[2]=3; h_fil[3]=4; h_fil[4]=5; h_fil[5]=6; h_fil[6]=7;
  l_upd[1]=0; l_upd[2]=2; l_upd[3]=4; l_upd[4]=5; l_upd[5]=6; l_upd[6]=7;
  h_upd[1]=1; h_upd[2]=3; h_upd[3]=4; h_upd[4]=5; h_upd[5]=6; h_upd[6]=7;
  // was example what input we need for parametrizable filter definition

  balance_coefs2(NF, 6, l_fil, h_fil);
  balance_coefs2(NF, 6, l_upd, h_upd);

  // ======= definition of signals =======

  PORT_TYPE(in_sample, T(T_sample_lms));
  PORT_TYPE(out_i, T(T_sample_lms));
  PORT_TYPE(out_q, T(T_sample_lms));

  dfix T_step(0, 5, 0, dfix::ns); // shifts 0 -> 31

  _sigarray Fi_coef("Fi_coef", NF, &_ck, T(T_Fcoef_lms));
  _sigarray Fq_coef("Fq_coef", NF, &_ck, T(T_Fcoef_lms));
  _sigarray I_sample("I_sample", NF+F_max_delay, &_ck, T(T_sample_lms));
  _sigarray Fi_mult("Fi_mult", NF, T(T_accu_lms));
  _sigarray Fq_mult("Fq_mult", NF, T(T_accu_lms));
  _sig Fi_sum("Fi_sum", T(T_accu_lms));
  _sig Fq_sum("Fq_sum", T(T_accu_lms));
  _sigarray fm_i("fm_i", NF, T(T_accu_lms));
  _sigarray fm_q("fm_q", NF, T(T_accu_lms));
  _sigarray fmult_i("fmult_i", NF, T(T_Fcoef_lms));
  _sigarray fmult_q("fmult_q", NF, T(T_Fcoef_lms));
  SIGCK(I_accu, _ck, T(T_accu_lms));
  SIGCK(Q_accu, _ck, T(T_accu_lms));
  SIGW(I_equal, T(T_accu_lms));
  SIGW(Q_equal, T(T_accu_lms));
  SIGCK(I_error, _ck, T(T_accu_lms));
  SIGCK(Q_error, _ck, T(T_accu_lms));
  SIGW(I_slice, T(T_equal_lms));
  SIGW(Q_slice, T(T_equal_lms));
  SIGCK(step, _ck, T_step);
  SIGCK(constel, _ck, T_bit);

#ifdef I2C
  _slave.put(&step);
  for (i = 0; i < NF; i++)
    _slave.put(&Fi_coef[i]);
  for (i = 0; i < NF; i++)
    _slave.put(&Fq_coef[i]);
#endif

  // definition of states

  cfsm = &_fsm; // controller handle

  int phi;
  state* loop_cycle[100];
  state* rst_cycle;

  rst_cycle = new state;       // define the state
  *rst_cycle << "rst";         // name the state
  *cfsm << deflt(*rst_cycle);  // assign the state to the controller

  for (phi = 1; phi <= NCYC; phi++) {
    loop_cycle[phi] = new state;
    *loop_cycle[phi] << strapp("cycle_", phi);
    *cfsm << *loop_cycle[phi];
  }

  // definition of sfg's

  sfg* _lms_filt[100];
  sfg* _lms_update_coefs[100];


  SFG(lms_read_allways);
  GET(constel_mode);
  constel = constel_mode;


  SFG(lms_initialize_coefs);
  int offs = (NF-4)/2;
  Fq_coef[offs+0] = W(T(T_Fcoef_lms), p0);
  Fq_coef[offs+1] = W(T(T_Fcoef_lms), 0);
  Fq_coef[offs+2] = W(T(T_Fcoef_lms), p2);
  Fq_coef[offs+3] = W(T(T_Fcoef_lms), 0);

  Fi_coef[offs+0] = W(T(T_Fcoef_lms), 0);
  Fi_coef[offs+1] = W(T(T_Fcoef_lms), p1);
  Fi_coef[offs+2] = W(T(T_Fcoef_lms), 0);
  Fi_coef[offs+3] = W(T(T_Fcoef_lms), p3);

  for (i = 0; i < NF; i++) {
    if ((i < offs) && (i > offs+3)) {
      Fi_coef[i] = W(T(T_Fcoef_lms), 0);
      Fq_coef[i] = W(T(T_Fcoef_lms), 0);
    }
  }


  SFG(lms_reset);
  for (i = 0; i < NF+F_max_delay; i++) {
    I_sample[i] = W(T(T_sample_lms), 0);
  }
  setv(I_error, 0);
  setv(Q_error, 0);
  setv(step, STEP);

  // FILTER (1. cycle to 8. cycle)

  int delay = 0; int cnt = 0;
  int L, H;

  // no filtering in 1st clockcycle
  cnt++; if (cnt == CPS) { cnt = 0; delay++; }


  for (p = 1; p <= NCYC-2; p++) {
    REGISTER_SFG(lms_filt, p);
    cnt++; if (cnt == CPS) { cnt = 0; delay++; }

    // filter feedforward
    L = l_fil[p]; H = h_fil[p];
    for (k = L; k <= H; k++)
      Fi_mult[k] = cast(T(T_accu_lms), Fi_coef[k]*I_sample[k+delay]);
    if (H >= 0) adder_tree(Fi_mult, L, H, Fi_sum);

    for (k = L; k <= H; k++)
      Fq_mult[k] = cast(T(T_accu_lms), Fq_coef[k]*I_sample[k+delay]);
    if (H >= 0) adder_tree(Fq_mult, L, H, Fq_sum);


    // sum I over start_ff -> end_ff
    if (p == start_fil) {
      I_accu = Fi_sum;
      Q_accu = Fq_sum;
    }
    else if ((p > start_fil) && (p < end_fil)) {
      I_accu = I_accu + Fi_sum;
      Q_accu = Q_accu + Fq_sum;
    }
    else if (p == end_fil) {
      I_accu = I_accu + Fi_sum;
      Q_accu = Q_accu + Fq_sum;
      I_equal = I_accu + Fi_sum;
      Q_equal = Q_accu + Fq_sum;
    }
  } // end for

  // compensate for 1 clockcycle vacancy
  cnt++; if (cnt == CPS) { cnt = 0; delay++; }

  // UPDATE (1. cycle to 8. cycle)
  int STEPSAFE = 4; // safety region for downshifting
  for (p = 1; p <= NCYC-2; p++) {
    REGISTER_SFG(lms_update_coefs, p);
    cnt++; if (cnt == CPS) { cnt = 0; delay++; }

    L = l_upd[p]; H = h_upd[p];
    for (k = L; k <= H; k++)
    {
      fm_i[k] = cast(T(T_accu_lms), I_sample[k+delay]*I_error);
      vshr(fmult_i[k], fm_i[k], step, STEPSAFE);
      Fi_coef[k] = Fi_coef[k] + fmult_i[k];

      fm_q[k] = cast(T(T_accu_lms), I_sample[k+delay]*Q_error);
      vshr(fmult_q[k], fm_q[k], step, STEPSAFE);
      Fq_coef[k] = Fq_coef[k] + fmult_q[k];
    }
  }

  SFG(lms_out_ready);
  out_i = cast(T(T_sample_lms), I_equal);
  out_q = cast(T(T_sample_lms), Q_equal);
  symtype = constel;

  // SLICER

  SFG(lms_slice_and_error);
  double c = ref/3;
  I_equal = I_accu;
  Q_equal = Q_accu;

  I_slice = (constel == W(T_bit, 0)).cassign(

    (I_equal > C(I_equal, +2*c)).cassign(C(I_slice, +3*c),
    (I_equal > C(I_equal,  0*c)).cassign(C(I_slice, +1*c),
    (I_equal > C(I_equal, -2*c)).cassign(C(I_slice, -1*c),
                                         C(I_slice, -3*c)))),

    (I_equal > C(I_equal, 0*c)).cassign(C(I_slice, +3*c),
                                        C(I_slice, -3*c))
  );

  Q_slice = (constel == W(T_bit, 0)).cassign(

    (Q_equal > C(Q_equal, +2*c)).cassign(C(Q_slice, +3*c),
    (Q_equal > C(Q_equal,  0*c)).cassign(C(Q_slice, +1*c),
    (Q_equal > C(Q_equal, -2*c)).cassign(C(Q_slice, -1*c),
                                         C(Q_slice, -3*c)))),

    (Q_equal > C(Q_equal, 0*c)).cassign(C(Q_slice, +3*c),
                                        C(Q_slice, -3*c))
  );

  I_error = cast(T(T_accu_lms), I_slice) - I_equal;
  Q_error = cast(T(T_accu_lms), Q_slice) - Q_equal;

  // IO definition

  SFG(lms_in);
  GET(in_sample);
  I_sample[0] = in_sample;
  for (i = NF+F_max_delay-1; i > 0; i--) {
    I_sample[i] = I_sample[i-1];
  }

  SFG(lms_out);
  PUT(out_i);
  PUT(out_q);
  PUT(symtype);

  //======= define the fsm for fixed 8 cycle timebudget

  DEFAULTDO(lms_read_allways);
  *rst_cycle ALLWAYS
    DO(lms_reset)
    DO(lms_initialize_coefs)
    << *loop_cycle[1];

  *loop_cycle[1] ALLWAYS
    DO(lms_in)
    << *_lms_update_coefs[1]
    << *loop_cycle[2];

  *loop_cycle[2] ALLWAYS
    << *_lms_filt[1]
    << *_lms_update_coefs[2]
    << *loop_cycle[3];

  *loop_cycle[3] ALLWAYS
    DO(lms_in)
    << *_lms_filt[2]
    << *_lms_update_coefs[3]
    << *loop_cycle[4];

  *loop_cycle[4] ALLWAYS
    << *_lms_filt[3]
    << *_lms_update_coefs[4]
    << *loop_cycle[5];

  *loop_cycle[5] ALLWAYS
    DO(lms_in)
    << *_lms_filt[4]
    << *_lms_update_coefs[5]
    << *loop_cycle[6];

  *loop_cycle[6] ALLWAYS
    << *_lms_filt[5]
    << *_lms_update_coefs[6]
    << *loop_cycle[7];

  *loop_cycle[7] ALLWAYS
    DO(lms_in)
    << *_lms_filt[6]   // filtering finished -> ready to output
    DO(lms_out_ready)
    << *loop_cycle[8];

  *loop_cycle[8] ALLWAYS
    DO(lms_out)
    DO(lms_slice_and_error)
    << *loop_cycle[1];

#ifdef I2C
  _slave.attach(_fsm, *loop_cycle[1], _ck);
#endif

  _fsm.setinfo(verbose);
  ofstream F0("lmsff_trans0.dot");
  F0 << _fsm;
  F0.close();

  transform TRANSF(_fsm);
  TRANSF.fsm_handshake1(_ck);

  ofstream F("lmsff_trans.dot");
  F << _fsm;
  F.close();
  _fsm.setinfo(silent);

  FSMEXP(typeName());

}

6.9 rx/macros.h 



// @(#)macros.h 1.1 98/01/22

#ifndef MACROS_H
#define MACROS_H

// #define max(a,b) (a > b) ? a : b

#include "qlib.h"

extern dfix T_bit;
extern dfix T_2bit;
extern dfix T_4bit;
extern dfix T_8bit;
extern dfix T_float;

extern dfix T_Cshift; // type for constant shifter
extern dfix* overcast;
extern dfix ycast;
extern strstream* gstr;


#define PRT(v)        FB & _##v; _sig v
#define _PRT(v)       FB & _##v
#define IS_SIG(v,t)   _##v(_##v), v(#v,t)
#define IS_REG(v,c,t) _##v(_##v), v(#v,c,t)
#define GET(v)        IN(v, _##v)
#define PUT(v)        OUT(v, _##v)
#define IS_OP(v)      _##v.asSink(this)
#define IS_IP(v)      _##v.asSource(this)
#define FBID(v)       _##v

#define C(y, x)     W((y).Rep()->getVal(), x)
#define acast(y, x) cast((y).Rep()->getVal(), ##x)

#define setv(y,x) y = W(y.Rep()->getVal(), x);

#define REGISTER_SFG(s, i) _##s[i] = new sfg; \
    _##s[i]->next = glbListOfSfg; \
    glbListOfSfg = _##s[i]; \
    *_##s[i] << strapp(strapp(#s, "_"), i); \
    _##s[i]->starts(); \
    csfg = _##s[i]

#define PORT_TYPE(v, t) v.Rep()->dupVal(t); \
    if (v.Rep()->isregister()) v.Rep()->dupRegVal(t)

#define DSIGW(s,n,w) s[n] = new _sig(strapp(strapp(#s, "_"), n), w)

// constant right-shift (division)
//--------------------------------------------------------------
#define shr(y, x, b) \
    overcast = new dfix(0, x.Rep()->getVal().TypeW()+b, x.Rep()->getVal().TypeL()+b); \
    ycast.duplicate(y.Rep()->getVal()); \
    y = cast(ycast, cast(*overcast, x) >> W(T_Cshift, b)); \
    delete overcast;

// constant left-shift (multiplication)
//--------------------------------------------------------------
#define shl(y, x, b) \
    if (x.Rep()->getVal().isFix()) \
      overcast = new dfix(0, x.Rep()->getVal().TypeW()+b, x.Rep()->getVal().TypeL()); \
    else \
      overcast = new dfix(0); \
    ycast.duplicate(y.Rep()->getVal()); \
    y = cast(ycast, cast(*overcast, x) << W(T_Cshift, b)); \
    delete overcast;

// variable shifters with safety region
//--------------------------------------------------------------
//
// description vshl(y,x,e,b) :=: y = x<<e (with 'b' as a safety region)
//
#define vshl(y, x, e, b) \
    overcast = new dfix(0, x.Rep()->getVal().TypeW()+b, x.Rep()->getVal().TypeL()); \
    y = acast(y, cast(*overcast, x) << e); \
    delete overcast;

#define vshr(y, x, e, b) \
    if (x.Rep()->getVal().isFix()) \
      overcast = new dfix(0, x.Rep()->getVal().TypeW()+b, x.Rep()->getVal().TypeL()+b); \
    else \
      overcast = new dfix(0); \
    y = acast(y, cast(*overcast, x) >> e); \
    delete overcast;


#endif
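The vshr macro widens its operand by b bits before shifting right, so that bits shifted out of the original word are still available when the result is cast to the type of the destination. The sketch below illustrates this "safety region" idea on integer-coded fixed-point values. It assumes that the second and third dfix constructor arguments denote total word length and fractional length, which the listing does not state explicitly, and the name shr_guarded and its parameters are hypothetical. Non-negative values are used in the example, since right-shifting a negative signed integer is implementation-defined before C++20.

```cpp
#include <cassert>
#include <cstdint>

// Right shift with 'guard' extra fractional bits kept during the shift
// (the role of 'b' in vshr). 'raw' carries 'frac' fractional bits; the
// result carries 'out_frac' fractional bits and is truncated.
inline std::int64_t shr_guarded(std::int64_t raw, int frac, int shift,
                                int guard, int out_frac) {
    assert(shift >= 0 && guard >= 0 && frac >= 0 && out_frac >= 0);
    assert(out_frac <= frac + guard);
    const std::int64_t wide = raw * (std::int64_t{1} << guard);  // now frac + guard fractional bits
    const std::int64_t shifted = wide >> shift;                   // value divided by 2^shift
    return shifted >> (frac + guard - out_frac);                  // cast to the target format
}
```

Shifting the integer 5 right by one position gives 2 without a guard region, whereas with four guard bits and a destination holding two fractional bits the result is the code 10, that is 2.5.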



6.10 rx/macros.cxx

#include "macros.h"

dfix T_bit(0, 1, 0, dfix::ns);
dfix T_2bit(0, 2, 0, dfix::tc);
dfix T_4bit(0, 4, 0, dfix::ns);
dfix T_8bit(0, 8, 0, dfix::ns);
dfix T_float(0);

dfix T_Cshift(0, 4, 0, dfix::ns); // type for constant shifter 0..15
dfix* overcast;
dfix ycast;
strstream* gstr;



6.11 rx/typedefine.cxx

#include "typedefine.h"

#include <fstream.h>

typedefine glbTypes;

typedefine::typedefine() {
  numt = 0;
}

void typedefine::load(char *_name) {
  ifstream IF(_name);

  if (IF.fail()) {
    cerr << "***_ERROR:_typedefine:_cannot_open_file_" << _name << "\n";
    exit(0);
  }

  while (!IF.eof() && !IF.fail()) {
    char buf[100];
    IF >> buf;

    if (!strlen(buf))
      continue;

    if (buf[0] == '/' && buf[1] == '/') {
      int endoftype = 0;
      while (!endoftype) {
        char c;
        IF.get(c);
        endoftype = (c == '\n');
      }
      continue;
    } else {
      name[numt] = new char[strlen(buf)+1];
      strcpy(name[numt], buf);
      int i;
      for (i = 0; i < numt; i++)
        if (!strcmp(name[i], buf)) {
          cerr << "***_ERROR:_typedefine:_type_" << buf << "_defined_twice\n";
          exit(0);
        }
      int W, L, repr = dfix::tc, overflow = dfix::err, truncate = dfix::fl;

      IF >> buf;
      W = atoi(buf);
      if (W == 0) {
        cerr << "***_ERROR:_typedefine:_bad_W_for_type_" << name[numt] << "\n";
        exit(0);
      }

      int endcom = 0;

      IF >> buf;
      L = atoi(buf);
      if (buf[strlen(buf)-1] == ';') {
        endcom = 1;
        buf[strlen(buf)-1] = 0;
      }
      while (1) {
        if (endcom)
          break;

        IF >> buf;

        if (buf[strlen(buf)-1] == ';') {
          endcom = 1;
          buf[strlen(buf)-1] = 0;
        }

        if (!strcmp(buf, "ns"))
          repr = dfix::ns;
        else if (!strcmp(buf, "tc"))
          repr = dfix::tc;
        else if (!strcmp(buf, ";"))
          break;
        else if (!endcom) {
          cerr << "***_ERROR:_typedefine:_" << name[numt] << "_bad_repr_" << buf << "\n";
          exit(0);
        }


        if (endcom)
          break;

        IF >> buf;

        if (buf[strlen(buf)-1] == ';') {
          endcom = 1;
          buf[strlen(buf)-1] = 0;
        }

        if (!strcmp(buf, "wp"))
          overflow = dfix::wp;
        else if (!strcmp(buf, "st"))
          overflow = dfix::st;
        else if (!strcmp(buf, "er"))
          overflow = dfix::err;
        else if (!strcmp(buf, ";"))
          break;
        else if (!endcom) {
          cerr << "***_ERROR:_typedefine:_" << name[numt] << "_bad_ovf_" << buf << "\n";
          exit(0);
        }

        if (endcom)
          break;

        IF >> buf;

        if (buf[strlen(buf)-1] == ';') {
          endcom = 1;
          buf[strlen(buf)-1] = 0;
        }

        if (!strcmp(buf, "rd"))
          truncate = dfix::rd;
        else if (!strcmp(buf, "fl"))
          truncate = dfix::fl;
        else if (!strcmp(buf, ";"))
          break;
        else if (!endcom) {
          cerr << "***_ERROR:_typedefine:_" << name[numt] << ":_bad_rnd_" << buf << "\n";
          exit(0);
        }

        if (endcom)
          break;

        int endoftype = 0;
        while (!endoftype) {
          char c;
          IF.get(c);
          endoftype = (c == '\n');
        }
        break;
      }
      types[numt].duplicate(dfix(0, W, L, repr, overflow, truncate));

      numt++;
      if (numt >= MAXT) {
        cerr << "***_ERROR:_typedefine_has_too_much_types._increase_MAXT\n";
        exit(0);
      }
    }
  }
}
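From the parsing logic of load(), each entry of the type file appears to consist of a name, a word length W and a length L, optionally followed by a representation keyword (ns or tc), an overflow keyword (wp, st or er) and a rounding keyword (rd or fl), terminated by a semicolon; lines beginning with // are skipped. The sketch below parses one such entry under that reading. The TypeSpec structure, its field names and the function parse_type are assumptions, and a malformed entry yields false rather than terminating the program. Unlike the listing, the sketch requires the optional keywords to appear strictly in the stated order.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical plain-data mirror of one type entry; the keyword sets and
// the defaults (tc, er, fl) follow typedefine::load().
struct TypeSpec {
    std::string name;
    int W = 0, L = 0;
    std::string repr = "tc";      // "ns" | "tc"
    std::string overflow = "er";  // "wp" | "st" | "er"
    std::string rounding = "fl";  // "rd" | "fl"
};

// Parses "name W L [repr [overflow [rounding]]];".
// Leaves 'out' untouched and returns false on a malformed entry.
inline bool parse_type(const std::string& line, TypeSpec& out) {
    const std::string text = line.substr(0, line.find(';'));  // drop ';' and what follows
    std::istringstream in(text);
    TypeSpec t;
    if (!(in >> t.name >> t.W >> t.L) || t.W == 0) return false;
    assert(!t.name.empty());
    std::string tok;
    if (in >> tok) { if (tok != "ns" && tok != "tc") return false; t.repr = tok; }
    if (in >> tok) { if (tok != "wp" && tok != "st" && tok != "er") return false; t.overflow = tok; }
    if (in >> tok) { if (tok != "rd" && tok != "fl") return false; t.rounding = tok; }
    out = t;
    return true;
}
```

An entry such as "T_bit 1 0 ns;" would thus keep the default overflow and rounding behaviour; the numeric values in this example are illustrative only and are not taken from the type file used for the QAM system.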

void typedefine::list() {
  int i;

  for (i = 0; i < numt; i++) {
    cout.width(20);
    cout << name[i];

    cout.width(5);
    cout << types[i].TypeW();

    cout.width(5);
    cout << types[i].TypeL();

    cout.width(4);
    if (types[i].TypeSign() == dfix::ns)
      cout << "ns";
    else
      cout << "tc";

    cout.width(4);
    if (types[i].TypeOverflow() == dfix::wp)
      cout << "wp";
    else if (types[i].TypeOverflow() == dfix::st)
      cout << "st";
    else
      cout << "err";

    cout.width(4);
    if (types[i].TypeRound() == dfix::fl)
      cout << "fl";
    else
      cout << "rd";

    cout << "\n";
  }
}

static dfix dummy(0);

dfix &typedefine::find(char *_name) {
  int i;
  if (!numt)
    return dummy;
  for (i = 0; i < numt; i++)
    if (!strcmp(name[i], _name))
      return types[i];
  cerr << "***_WARNING:_typedefine:_type_" << _name << "_was_not_found\n";
  return dummy;
}

dfix &typedefine::find(char *_name, dfix& v) {
  int i;
  if (!numt)
    return v;
  for (i = 0; i < numt; i++)
    if (!strcmp(name[i], _name))
      return types[i];
  cerr << "***_WARNING:_typedefine:_type_" << _name << "_was_not_found\n";
  return v;
}

6.12 rx/typedefine.h

#ifndef TYPEDEFINE_H
#define TYPEDEFINE_H

#define MAXT 100

#include "qlib.h"


class typedefine {
  char *name[100];
  dfix types[MAXT];
  int numt;
public:
  typedefine();
  void load(char *file);
  void list();
  dfix &find(char *name);
  dfix &find(char *name, dfix& v);
};

extern typedefine glbTypes;

#define LOADTYPES(a) glbTypes.load(#a); glbTypes.list()
#define T(a)         glbTypes.find(#a)
#define TT(a,b)      glbTypes.find(#a, b)

#endif

Part C: Generated VHDL code of the QAM system 
6.13 vhdl/RX_TI.vhd 
-- OCAPI - alpha release - generated Fri Jun 12 16:45:44 1998

-- System Link Cell for design RX_TI

library IEEE;
use IEEE.std_logic_1164.all;

entity RX_TI is
  port (
    reset           : in  std_logic;
    clk             : in  std_logic;
    chan_out        : in  std_logic_vector(11 downto 0);
    rx_diff_mode    : in  std_logic;
    rx_constel_mode : in  std_logic;
    rx_byte_out     : out std_logic_vector(7 downto 0);
    rx_sync_out     : out std_logic
  );
end RX_TI;

architecture structure of RX_TI is

  component lmsff
    port (
      reset        : in  std_logic;
      clk          : in  std_logic;
      h1wack       : in  std_logic;
      constel_mode : in  std_logic;
      in_sample    : in  std_logic_vector(11 downto 0);
      h1wreq       : out std_logic;
      out_i        : out std_logic_vector(11 downto 0);
      out_q        : out std_logic_vector(11 downto 0);
      symtype      : out std_logic
    );
  end component;

  component demap
    port (
      reset       : in  std_logic;
      clk         : in  std_logic;
      h2wack      : in  std_logic;
      h1rack      : in  std_logic;
      diff_mode   : in  std_logic;
      i_in        : in  std_logic_vector(11 downto 0);
      q_in        : in  std_logic_vector(11 downto 0);
      symtype_in  : in  std_logic;
      h2wreq      : out std_logic;
      h1rreq      : out std_logic;
      symbol_out  : out std_logic_vector(3 downto 0);
      symtype_out : out std_logic
    );
  end component;

  component detuple
    port (
      reset   : in  std_logic;
      clk     : in  std_logic;
      h3wack  : in  std_logic;
      h2rack  : in  std_logic;
      symbol  : in  std_logic_vector(3 downto 0);
      symtype : in  std_logic;
      h3wreq  : out std_logic;
      h2rreq  : out std_logic;
      byte    : out std_logic_vector(7 downto 0);
      syncro  : out std_logic
    );
  end component;

  component derand
    port (
      reset    : in  std_logic;
      clk      : in  std_logic;
      h3rack   : in  std_logic;
      byte_in  : in  std_logic_vector(7 downto 0);
      syncro   : in  std_logic;
      h3rreq   : out std_logic;
      byte_out : out std_logic_vector(7 downto 0);
      sync_out : out std_logic
    );
  end component;

  signal unused        : std_logic;
  signal h1_ffshk      : std_logic;
  signal rx_lms_i      : std_logic_vector(11 downto 0);
  signal rx_lms_q      : std_logic_vector(11 downto 0);
  signal rx_symtype    : std_logic;
  signal h2_ffshk      : std_logic;
  signal h1_fbshk      : std_logic;
  signal rx_symbol     : std_logic_vector(3 downto 0);
  signal rx_symtype_at : std_logic;
  signal h3_ffshk      : std_logic;
  signal h2_fbshk      : std_logic;
  signal rx_byte_rnd   : std_logic_vector(7 downto 0);
  signal rx_syncro     : std_logic;
  signal h3_fbshk      : std_logic;

begin

  lmsff_proc: lmsff
    port map (
      reset        => reset,
      clk          => clk,
      h1wack       => h1_fbshk,
      constel_mode => rx_constel_mode,
      in_sample    => chan_out,
      h1wreq       => h1_ffshk,
      out_i        => rx_lms_i,
      out_q        => rx_lms_q,
      symtype      => rx_symtype
    );

  demap_proc: demap
    port map (
      reset       => reset,
      clk         => clk,
      h2wack      => h2_fbshk,
      h1rack      => h1_ffshk,
      diff_mode   => rx_diff_mode,
      i_in        => rx_lms_i,
      q_in        => rx_lms_q,
      symtype_in  => rx_symtype,
      h2wreq      => h2_ffshk,
      h1rreq      => h1_fbshk,
      symbol_out  => rx_symbol,
      symtype_out => rx_symtype_at
    );

  detuple_proc: detuple
    port map (
      reset   => reset,
      clk     => clk,
      h3wack  => h3_fbshk,
      h2rack  => h2_ffshk,
      symbol  => rx_symbol,
      symtype => rx_symtype_at,
      h3wreq  => h3_ffshk,
      h2rreq  => h2_fbshk,
      byte    => rx_byte_rnd,
      syncro  => rx_syncro
    );

  derand_proc: derand
    port map (
      reset    => reset,
      clk      => clk,
      h3rack   => h3_ffshk,
      byte_in  => rx_byte_rnd,
      syncro   => rx_syncro,
      h3rreq   => h3_fbshk,
      byte_out => rx_byte_out,
      sync_out => rx_sync_out
    );

end structure;



6.14 vhdl/derand_proc_ENT.vhd

-- OCAPI - alpha release - generated Thu Jun 11 14:57:23 1998
-- includes sfg
--   derandrstphase10
--   derandphase1phase20
--   derandphase1phase11
--   derandphase2phase10
--   derandinireg_derandrst0

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
library FXT_PNT_LIB;
use FXT_PNT_LIB.pck_fixed_point.all;

entity derand_proc is
  port (
    clk           : in  std_logic;
    reset         : in  std_logic;
    h3rack        : in  FX(0 downto 0);
    syncro        : in  FX(0 downto 0);
    byte_in       : in  FX(7 downto 0);
    h3rreq        : out FX(0 downto 0);
    h3rackreg_reg : out FX(0 downto 0);
    byte_out_reg  : out FX(7 downto 0);
    sync_out_reg  : out FX(0 downto 0)
  );
end derand_proc;

6.15 vhdl/derand_proc_RTL.vhd

--
-- OCAPI - alpha release - generated Thu Jun 11 14:57:23 1998
--
-- includes sfg
--   derandrstphase10
--   derandphase1phase20
--   derandphase1phase11
--   derandphase2phase10
--   derandinireg_derandrst0
--

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
library FXT_PNT_LIB;
use FXT_PNT_LIB.pck_fixed_point.all;

architecture RTL of derand_proc is

  -- State Declaration
  signal seed_atl      : FX(15 downto 0);
  signal seed          : FX(15 downto 0);
  signal shiftreg_atl  : FX(15 downto 0);
  signal shiftreg      : FX(15 downto 0);
  signal bypass_atl    : FX(0 downto 0);
  signal bypass        : FX(0 downto 0);
  signal h3rackreg_atl : FX(0 downto 0);
  signal h3rackreg     : FX(0 downto 0);
  signal byte_out_atl  : FX(7 downto 0);
  signal byte_out      : FX(7 downto 0);
  signal sync_out_atl  : FX(0 downto 0);
  signal sync_out      : FX(0 downto 0);
  type STATE_TYPE is (
    rst,
    phase1,
    phase2,
    inireg_derand);
  signal current_state, next_state : STATE_TYPE;

begin

  h3rackreg_reg <= h3rackreg_atl;

  byte_out_reg <= byte_out_atl;

  sync_out_reg <= sync_out_atl;

  -- Register clocking
  SYNC : process (clk)

  begin
    if (clk'event and clk = '1') then
      -- state update
      current_state <= next_state;
      -- tick all registers
      seed_atl      <= seed;
      shiftreg_atl  <= shiftreg;
      bypass_atl    <= bypass;
      h3rackreg_atl <= h3rackreg;
      byte_out_atl  <= byte_out;
      sync_out_atl  <= sync_out;
    end if;
  end process;





  -- SFG evaluation
  COMB : process (
    current_state,
    reset,
    h3rack,
    syncro,
    seed_atl,
    shiftreg_atl,
    bypass_atl,
    byte_in,
    h3rackreg_atl,
    byte_out_atl,
    sync_out_atl)




/ / 









    -- intermediate variables
    variable shifts_0 : FX(15 downto 0);
    variable xbits_0  : FX(0 downto 0);
    variable masks_0  : FX(7 downto 0);
    variable shifts_1 : FX(15 downto 0);
    variable xbits_1  : FX(0 downto 0);
    variable masks_1  : FX(7 downto 0);
    variable shifts_2 : FX(15 downto 0);
    variable xbits_2  : FX(0 downto 0);
    variable masks_2  : FX(7 downto 0);
    variable shifts_3 : FX(15 downto 0);
    variable xbits_3  : FX(0 downto 0);
    variable masks_3  : FX(7 downto 0);
    variable shifts_4 : FX(15 downto 0);
    variable xbits_4  : FX(0 downto 0);
    variable masks_4  : FX(7 downto 0);
    variable shifts_5 : FX(15 downto 0);
    variable xbits_5  : FX(0 downto 0);
    variable masks_5  : FX(7 downto 0);
    variable shifts_6 : FX(15 downto 0);
    variable xbits_6  : FX(0 downto 0);
    variable masks_6  : FX(7 downto 0);
    variable shifts_7 : FX(15 downto 0);
    variable xbits_7  : FX(0 downto 0);
    variable masks_7  : FX(7 downto 0);
    variable shifts_8 : FX(15 downto 0);
    variable masks_8  : FX(7 downto 0);
    variable mask     : FX(7 downto 0);

  begin



    -- update all registers and outputs
    h3rreq    <= CAST("0.");
    seed      <= seed_atl;
    shiftreg  <= shiftreg_atl;
    bypass    <= bypass_atl;
    h3rackreg <= h3rackreg_atl;
    byte_out  <= byte_out_atl;
    sync_out  <= sync_out_atl;

    -- default update state register
    next_state <= current_state;

    case current_state is

      when rst =>


        byte_out   <= CAST("00000000.");
        seed       <= CAST("0000000000111111.");
        sync_out   <= CAST("0.");
        bypass     <= CAST("0.");
        shiftreg   <= CAST("0000000000000000.");
        h3rackreg  <= h3rack;
        h3rreq     <= CAST("1.");
        next_state <= phase1;

      when phase1 =>

        if ((true) and (ToBool(h3rackreg_atl))) then
          shifts_0 := cassign(syncro = CAST("1."),
                              seed_atl,
                              shiftreg_atl);
          masks_0  := CAST("00000000.");
          xbits_0  := (CAST(0,0,SHR(shifts_0,4))) xor (CAST(0,0,SHR(shifts_0,5)));
          shifts_1 := ((CAST(15,0,xbits_0)) and (CAST("0000000000000001.")))
                      or ((SHL(shifts_0,1)) and (CAST("0000000001111111.")));
          masks_1  := (SHL(masks_0,1)) or ((CAST(7,0,xbits_0)) and (CAST("00000001.")));
          xbits_1  := (CAST(0,0,SHR(shifts_1,4))) xor (CAST(0,0,SHR(shifts_1,5)));
          shifts_2 := ((CAST(15,0,xbits_1)) and (CAST("0000000000000001.")))
                      or ((SHL(shifts_1,1)) and (CAST("0000000001111111.")));
          masks_2  := (SHL(masks_1,1)) or ((CAST(7,0,xbits_1)) and (CAST("00000001.")));
          xbits_2  := (CAST(0,0,SHR(shifts_2,4))) xor (CAST(0,0,SHR(shifts_2,5)));
          shifts_3 := ((CAST(15,0,xbits_2)) and (CAST("0000000000000001.")))
                      or ((SHL(shifts_2,1)) and (CAST("0000000001111111.")));
          masks_3  := (SHL(masks_2,1)) or ((CAST(7,0,xbits_2)) and (CAST("00000001.")));
          xbits_3  := (CAST(0,0,SHR(shifts_3,4))) xor (CAST(0,0,SHR(shifts_3,5)));
          shifts_4 := ((CAST(15,0,xbits_3)) and (CAST("0000000000000001.")))
                      or ((SHL(shifts_3,1)) and (CAST("0000000001111111.")));
          masks_4  := (SHL(masks_3,1)) or ((CAST(7,0,xbits_3)) and (CAST("00000001.")));
          xbits_4  := (CAST(0,0,SHR(shifts_4,4))) xor (CAST(0,0,SHR(shifts_4,5)));
          shifts_5 := ((CAST(15,0,xbits_4)) and (CAST("0000000000000001.")))
                      or ((SHL(shifts_4,1)) and (CAST("0000000001111111.")));
          masks_5  := (SHL(masks_4,1)) or ((CAST(7,0,xbits_4)) and (CAST("00000001.")));
          xbits_5  := (CAST(0,0,SHR(shifts_5,4))) xor (CAST(0,0,SHR(shifts_5,5)));
          shifts_6 := ((CAST(15,0,xbits_5)) and (CAST("0000000000000001.")))
                      or ((SHL(shifts_5,1)) and (CAST("0000000001111111.")));
          masks_6  := (SHL(masks_5,1)) or ((CAST(7,0,xbits_5)) and (CAST("00000001.")));
          xbits_6  := (CAST(0,0,SHR(shifts_6,4))) xor (CAST(0,0,SHR(shifts_6,5)));
          shifts_7 := ((CAST(15,0,xbits_6)) and (CAST("0000000000000001.")))
                      or ((SHL(shifts_6,1)) and (CAST("0000000001111111.")));
          masks_7  := (SHL(masks_6,1)) or ((CAST(7,0,xbits_6)) and (CAST("00000001.")));
          xbits_7  := (CAST(0,0,SHR(shifts_7,4))) xor (CAST(0,0,SHR(shifts_7,5)));
          shifts_8 := ((CAST(15,0,xbits_7)) and (CAST("0000000000000001.")))
                      or ((SHL(shifts_7,1)) and (CAST("0000000001111111.")));
          masks_8  := (SHL(masks_7,1)) or ((CAST(7,0,xbits_7)) and (CAST("00000001.")));
          shiftreg <= shifts_8;
          mask     := masks_8;
          byte_out <= cassign(bypass_atl = CAST("1."),
                              byte_in,
                              (byte_in) xor (mask));
          sync_out   <= CAST("1.");
          h3rackreg  <= h3rack;
          h3rreq     <= CAST("0.");
          next_state <= phase2;
        end if;

        if (not (ToBool(h3rackreg_atl))) then
          h3rreq     <= CAST("1.");
          h3rackreg  <= h3rack;
          next_state <= phase1;
        end if;

      when phase2 =>

        h3rackreg  <= h3rack;
        sync_out   <= CAST("0.");
        h3rreq     <= CAST("1.");
        next_state <= phase1;

      when inireg_derand =>

        seed       <= CAST("0000000000000000.");
        shiftreg   <= CAST("0000000000000000.");
        bypass     <= CAST("0.");
        byte_out   <= CAST("00000000.");
        sync_out   <= CAST("0.");
        next_state <= rst;

      when others =>
        next_state <= current_state;
    end case;

    if (reset = '1') then
      next_state <= inireg_derand;
      seed       <= CAST("0000000000000000.");
      shiftreg   <= CAST("0000000000000000.");
      bypass     <= CAST("0.");
      h3rackreg  <= CAST("0.");
      byte_out   <= CAST("00000000.");
      sync_out   <= CAST("0.");
    end if;

  end process;

end RTL;
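
The unrolled shifts_0 .. shifts_8 and masks_0 .. masks_8 assignments of state phase1 amount to eight steps of a feedback shift register. The following C++ fragment is a minimal behavioural sketch of that computation, given only as an illustration. It assumes that FX words can be treated as plain unsigned bit vectors and that SHR and SHL are ordinary bit shifts; the names DerandStep, derand_phase1 and derand_self_check are hypothetical and do not appear in the generated code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the values produced by one phase1 evaluation.
struct DerandStep {
    std::uint8_t  byte_out;  // value assigned to byte_out
    std::uint16_t shiftreg;  // value assigned to shiftreg (shifts_8)
    std::uint8_t  mask;      // value of mask (masks_8)
};

// Behavioural model of the phase1 branch taken when h3rackreg_atl is set.
// Assumption: FX values are modelled as unsigned integers.
DerandStep derand_phase1(std::uint16_t seed, std::uint16_t shiftreg,
                         bool syncro, bool bypass, std::uint8_t byte_in) {
    // shifts_0 := cassign(syncro = 1, seed_atl, shiftreg_atl)
    unsigned shifts = syncro ? seed : shiftreg;
    unsigned masks = 0;  // masks_0 := 0
    for (int i = 0; i < 8; ++i) {
        // xbits_i := bit 4 of shifts_i xor bit 5 of shifts_i
        const unsigned xbit = ((shifts >> 4) ^ (shifts >> 5)) & 1u;
        // shifts_(i+1) := xbits_i or (SHL(shifts_i, 1) and 0x007F)
        shifts = xbit | ((shifts << 1) & 0x007Fu);
        // masks_(i+1) := SHL(masks_i, 1) or xbits_i, kept to 8 bits
        masks = ((masks << 1) | xbit) & 0xFFu;
    }
    DerandStep r;
    r.shiftreg = static_cast<std::uint16_t>(shifts);
    r.mask = static_cast<std::uint8_t>(masks);
    // byte_out := cassign(bypass = 1, byte_in, byte_in xor mask)
    r.byte_out = bypass ? byte_in : static_cast<std::uint8_t>(byte_in ^ r.mask);
    return r;
}

// Small consistency check of the model (not part of the generated design).
void derand_self_check() {
    const DerandStep a = derand_phase1(0x003F, 0x0000, true, false, 0x5A);
    // Applying the same mask twice restores the original byte.
    const DerandStep b = derand_phase1(0x003F, 0x0000, true, false, a.byte_out);
    assert(b.byte_out == 0x5A);
}
```

Under these assumptions, a call with syncro set and the seed value 0x003F loaded in state rst yields the mask 0x04, and with bypass set the input byte is passed through unchanged.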



6.16 vhdl/derand_proc_STD.vhd

--
-- OCAPI - alpha release - generated Thu Jun 11 14:57:23 1998
--
-- includes sfg
--   derandrstphase10
--   derandphase1phase20
--   derandphase1phase11
--   derandphase2phase10
--   derandinireg_derandrst0
--

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
library FXT_PNT_LIB;
use FXT_PNT_LIB.pck_fixed_point.all;

entity derand is
  port(
    clk       : in  std_logic;
    reset     : in  std_logic;
    h3rack    : in  std_logic;
    syncro    : in  std_logic;
    byte_in   : in  std_logic_vector(7 downto 0);
    h3rreq    : out std_logic;
    h3rackreg : out std_logic;
    byte_out  : out std_logic_vector(7 downto 0);
    sync_out  : out std_logic
  );
end derand;

architecture structure of derand is

  component derand_proc
    port (
      clk           : in  std_logic;
      reset         : in  std_logic;
      h3rack        : in  FX(0 downto 0);
      syncro        : in  FX(0 downto 0);
      byte_in       : in  FX(7 downto 0);
      h3rreq        : out FX(0 downto 0);
      h3rackreg_reg : out FX(0 downto 0);
      byte_out_reg  : out FX(7 downto 0);
      sync_out_reg  : out FX(0 downto 0)
    );
  end component;

  signal FX_h3rack    : FX(0 downto 0);
  signal FX_syncro    : FX(0 downto 0);
  signal FX_byte_in   : FX(7 downto 0);
  signal FX_h3rreq    : FX(0 downto 0);
  signal FX_h3rackreg : FX(0 downto 0);
  signal FX_byte_out  : FX(7 downto 0);
  signal FX_sync_out  : FX(0 downto 0);

begin

  FX_h3rack(0) <= h3rack;
  FX_syncro(0) <= syncro;
  FX_byte_in   <= FX(SIGNED(byte_in));
  h3rreq       <= FX_h3rreq(0);
  h3rackreg    <= FX_h3rackreg(0);
  byte_out     <= CONV_STD_LOGIC_VECTOR(ToSigned(FX_byte_out), byte_out'length);
  sync_out     <= FX_sync_out(0);

  derand : derand_proc
    port map (
      clk           => clk,
      reset         => reset,
      h3rack        => FX_h3rack,
      syncro        => FX_syncro,
      byte_in       => FX_byte_in,
      h3rreq        => FX_h3rreq,
      h3rackreg_reg => FX_h3rackreg,
      byte_out_reg  => FX_byte_out,
      sync_out_reg  => FX_sync_out
    );

end structure;



6.17 vhdl/derand_tb.vhd

--
-- OCAPI - alpha release - generated Fri Jun 12 16:45:45 1998
--
--
-- TestBench for design derand
--
library IEEE;
use IEEE.std_logic_1164.all;

use IEEE.std_logic_textio.all;
use std.textio.all;

library clock;
use clock.clock.all;

entity derand_tb is
end derand_tb;

architecture rtl of derand_tb is

  signal reset     : std_logic;
  signal clk       : std_logic;
  signal h3rack    : std_logic;
  signal byte_in   : std_logic_vector(7 downto 0);
  signal syncro    : std_logic;
  signal h3rreq    : std_logic;
  signal h3rackreg : std_logic;
  signal byte_out  : std_logic_vector(7 downto 0);
  signal sync_out  : std_logic;

  component derand
    port (
      reset    : in  std_logic;
      clk      : in  std_logic;
      h3rack   : in  std_logic;
      byte_in  : in  std_logic_vector(7 downto 0);
      syncro   : in  std_logic;
      h3rreq   : out std_logic;
      byte_out : out std_logic_vector(7 downto 0);
      sync_out : out std_logic
    );
  end component;

begin

  crystal(clk, 50 ns);

  derand_dut : derand
    port map (
      reset,     -- reset=>
      clk,       -- clk=>
      h3rack,    -- h3rack=>
      byte_in,   -- byte_in=>
      syncro,    -- syncro=>
      h3rreq,    -- h3rreq=>
      byte_out,  -- byte_out=>
      sync_out); -- sync_out=>

  ini : process
  begin
    reset <= '1';
    wait until clk'event and clk = '1';
    reset <= '0';
    wait;
  end process;

  input : process
    file stimuli : text is in "derand_tb.dat";
    variable aline : line;

    file stimulo : text is out "derand_tb.sim_out";
    variable oline : line;

    variable v_h3rack      : std_logic;
    variable v_byte_in     : std_logic_vector(7 downto 0);
    variable v_syncro      : std_logic;
    variable v_h3rreq      : std_logic;
    variable v_byte_out    : std_logic_vector(7 downto 0);
    variable v_sync_out    : std_logic;
    variable v_h3rack_hx   : std_logic;
    variable v_byte_in_hx  : std_logic_vector(7 downto 0);
    variable v_syncro_hx   : std_logic;
    variable v_h3rreq_hx   : std_logic;
    variable v_byte_out_hx : std_logic_vector(7 downto 0);
    variable v_sync_out_hx : std_logic;

  begin
    wait until reset'event and reset = '0';
    loop
      if (not (endfile(stimuli))) then
        readline(stimuli, aline);
        read(aline, v_h3rack);
        read(aline, v_byte_in);
        read(aline, v_syncro);
      else
        assert false



          report "End of input file reached"
          severity warning;
      end if;

      h3rack  <= v_h3rack;
      byte_in <= v_byte_in;
      syncro  <= v_syncro;

      wait for 50 ns;

      v_h3rreq := h3rreq;






      v_byte_out := byte_out;
      v_sync_out := sync_out;

      v_h3rack_hx := v_h3rack;

      hwrite(oline, v_byte_in_hx);
      write(oline, v_h3rreq_hx);
      write(oline, ' ');
      hwrite(oline, v_byte_out_hx);
      write(oline, ' ');
      write(oline, v_sync_out_hx);
      write(oline, ' ');

      writeline(stimulo, oline);

      wait until clk'event and clk = '1';

    end loop;
  end process;
end rtl;

configuration tbc_rtl of derand_tb is
  for rtl
    for all : derand
      use entity work.derand(structure);
    end for;
  end for;
end tbc_rtl;
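
The test bench reads one stimulus record per line of derand_tb.dat, in the order h3rack, byte_in, syncro. As an illustration only, the following C++ fragment sketches how such a record could be parsed outside the simulator. It assumes that each field is written as binary characters ('0' and '1') and that the fields are separated by white space; the file layout is not given in the listing, and the names Stimulus, parse_bits, parse_stimulus and stimulus_self_check are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical record mirroring the three values read per stimulus line.
struct Stimulus {
    bool         h3rack;
    std::uint8_t byte_in;
    bool         syncro;
};

// Convert a token of exactly `width` binary characters to an integer.
// Returns false if the token has the wrong length or a non-binary character.
bool parse_bits(const std::string& tok, std::size_t width, unsigned& value) {
    if (tok.size() != width) return false;
    value = 0;
    for (char c : tok) {
        if (c != '0' && c != '1') return false;
        value = (value << 1) | static_cast<unsigned>(c - '0');
    }
    return true;
}

// Parse one line of the assumed form "<h3rack> <byte_in> <syncro>",
// for example "1 10100101 0". Returns false on malformed input.
bool parse_stimulus(const std::string& line, Stimulus& out) {
    std::istringstream in(line);
    std::string a, b, c;
    if (!(in >> a >> b >> c)) return false;
    unsigned h = 0, by = 0, s = 0;
    if (!parse_bits(a, 1, h) || !parse_bits(b, 8, by) || !parse_bits(c, 1, s)) {
        return false;
    }
    out.h3rack = (h != 0);
    out.byte_in = static_cast<std::uint8_t>(by);
    out.syncro = (s != 0);
    return true;
}

// Small consistency check of the parser (illustrative only).
void stimulus_self_check() {
    Stimulus st{};
    assert(parse_stimulus("1 10100101 0", st));
    assert(st.h3rack && st.byte_in == 0xA5 && !st.syncro);
}
```

A record parsed in this way could, for instance, be fed to a behavioural model of the derandomizer in order to compare its results with the simulator output file derand_tb.sim_out.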



